ABSTRACT


Basic considerations are discussed for determining sample sizes
and record lengths for various statistical tests and estimates which are
important to random fatigue testing. Methods for determining minimum
sample sizes when comparing means and variances of normally (Gaussian)
distributed random variables are described. Procedures for reducing
a relatively large sample to a smaller sample are presented. Elimination
of outliers and systematic resampling are two methods given.


An explanation is presented of the requirements and problems involved
in the determination of record lengths necessary for an estimate
of a given accuracy for autocorrelation functions, ordinary power spectral
density functions, cross-correlation functions, cross-spectral density
functions, frequency response functions, and probability density functions.

Due to its importance in random fatigue testing applications, the
basic properties of the Weibull distribution in terms of its parameters
and the failure rate are summarized. A presentation is given of estimation
and statistical testing problems related to the Weibull distribution.
The best available methods of estimating the parameters are described.
Methods of determining sample sizes needed for various analyses are
developed. Some problems of reliability analysis applicable in fatigue
testing are discussed. New methods of decision techniques for comparing
two or more systems are proposed in terms of reliability. The
report concludes with an example of the application of the Weibull
distribution to actual fatigue test data.
RANDOM FATIGUE TEST SAMPLING REQUIREMENTS


1.  INTRODUCTION 

In random fatigue testing, many different statistical parameters are
of interest in the various facets of a given test. This report discusses
some of the sample size requirements which are necessary to estimate
these parameters. Of central importance to fatigue life testing is the
Weibull distribution. This is so because the probability distributions of
quantities such as time to failure and cycles to failure are usually
reasonably well modeled by the Weibull distribution. Various statistical
aspects of the Weibull distribution are discussed herein.

In many experimental design situations, an assumption of a Gaussian
distribution of various sample statistics is invoked in order to allow the
estimation of required sample sizes or record lengths. This assumption is
almost always justified if the sample sizes one is concerned with are large,
say greater than thirty, since the central limit theorem implies that such
statistics are approximately Gaussian for large N. In other situations, one
is forced into Gaussian assumptions for small sample sizes due to the lack
of an available exact theory. Any requirements based on Gaussian assumptions
in this case become questionable but are at least reasonable guidelines for
experiment planning purposes. Hence, although most of the results in this
report are based on Gaussian assumptions, they can be usefully applied to
most practical problems.

The two most fundamental quantities in statistics are mean values
and variances. The first section following, therefore, discusses sample size
requirements for means and variances assuming a Gaussian distribution.
Often, a large sample of data will be collected which must be reduced to a
smaller, more tractable size. Some of the ideas involved in eliminating
unwanted data points (outliers) and methods for overall reduction of sample
size are presented. Later sections discuss correlation function, power
spectral density function, and frequency response function estimates, and
are presented in terms of an allowable percentage normalized standard
error. The final section describes fatigue life testing applications of the
Weibull distribution.

Two  different  approaches  are  used  for  determining  sample  size 
requirements.  In  Section  2,  it  is  assumed  that  some  reason  exists  for 
hypothesizing  a  specific  value  for  a  population  parameter.  One  may  then 
calculate  the  sample  size  necessary  to  detect  a  specified  deviation  from 
this  hypothesized  value  with  a  given  probability.  This  is  the  method  to  use 
when  one  has: 

i)  a  specific  value  predicted  by  theory  against  which  to  test 
(for  example,  the  theoretical  expected  number  of  runs  in  a 
sample  of  N  independent  observations  is  1  )- 

ii)  a measured known value, possibly from previous experiments,
and one is hypothesizing the new data to be significantly
different (i.e., testing a supposedly improved product).


The other approach is that of computing the sample size necessary to
estimate a parameter with a given percentage error as opposed to specifying
a specific value. The normalized standard error is employed. This is the
square root of the variance (the standard error) of the estimate divided by
its expected value (normalized) to give variability in a percentage form. The
pitfall in this concept lies in attaching undue importance to a deviation of
one standard deviation (rms value). Deviations of plus and minus one standard
deviation occur with a given probability and are of no more importance than
deviations of, say, plus and minus two or plus and minus one-half standard
deviations. Therefore, in quoting results or performing calculations one
must be careful to note that one allows a deviation of a given percentage with
a specific probability, and that rms values are not the maximum errors which
can occur.
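The point about rms values can be made concrete with a short sketch. It assumes the estimate is Gaussian about its expected value, as the report does elsewhere; the function names and the 10% normalized standard error are illustrative choices, not values from the report.

```python
from statistics import NormalDist


def prob_within(k: float) -> float:
    """Probability that a Gaussian estimate lies within +/- k standard errors."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)


def percent_band(eps: float, k: float) -> float:
    """Half-width of the +/- k standard error band as a fraction of the expected value."""
    return k * eps


if __name__ == "__main__":
    eps = 0.10  # assumed normalized standard error (illustrative)
    for k in (0.5, 1.0, 2.0):
        print(f"+/-{percent_band(eps, k):.0%} is not exceeded "
              f"with probability {prob_within(k):.3f}")
```

With a 10% normalized standard error, the estimate stays within 10% of its expected value only about two times out of three, so the rms value should be read as a typical error, not a bound.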
2.  SAMPLE SIZE CALCULATIONS FOR EQUIVALENCE
OF MEANS AND VARIANCES

In any situation where one knows or hypothesizes a mean and variance
of a Gaussian distribution, one may compute sample sizes necessary to
properly test sample values against these theoretical values. Certain
constraints must be imposed on the problem, such as specifying the level of
significance and probability of Type II Error, which are explained below.
The sample size may then be calculated which maintains these probabilities.
In other cases one may be able to calculate required sample sizes based on
a requirement to estimate a parameter with a specified percentage (rms)
error.

A theoretical mean $\mu$ and variance $\sigma^2$ can sometimes be computed
for a given distribution (or one can assume values). Using these theoretical
values, one can then test obtained sample values to determine if the observed
distribution can be considered to be the same as the theoretical distribution.
In this case the statistical hypothesis is: "There is no evidence to conclude
that the sample values are not the same as the theoretical values." These
will be two-tailed tests, since deviations from the hypothesized values may
occur in either direction.

Two  types  of  errors  can  be  made: 

Type I Error  -  Rejecting the hypothesis when it is really true,
with probability $\alpha$

Type II Error  -  Accepting the hypothesis when it is really false,
with probability $\beta$

To illustrate these two errors, one only needs to consider the sample mean
values computed from two different random samples of observations drawn
from the same underlying population. Clearly, with a certain small
probability, say $\alpha = 10\%$, these sample mean values might differ enough
to appear truly different. This is the Type I Error. On the other hand, if
random samples are collected from two slightly different populations, clearly
by chance (say with probability $\beta = 10\%$), the mean values computed from
these two samples might be so close together that they appear equivalent.
This is the Type II Error.

As can be seen from this example, the farther apart the populations
truly are, the smaller is the chance of the sample mean values appearing
equivalent. Hence, in addition to specifying $\alpha$ and $\beta$, one must
impose an additional restraint on the problem to allow the sample size to be
calculated. That is, one must specify what particular deviation from the
hypothesized parameter will allow the hypothesis to be accepted with
probability $\beta$. In some specific situations one might have suspicions
about the theory involved and anticipate some particular value other than the
hypothesized value. In other cases one must use judgment in selecting values
somewhat arbitrarily.

For the illustrative examples in this section, a ten percent difference
in means and a fifty percent difference in standard deviations are selected as
the values at which the probability of Type II Error will be held. Of course,
other deviations from the theoretical values may be chosen, and each has a
specific associated probability of the hypothesis being accepted. Also, for
simplicity, $\alpha$ and $\beta$ will each be chosen equal to 10%.




2.1  SAMPLE SIZE FOR EQUIVALENCE OF MEANS


The calculation of the required sample size for the test of equivalent
mean values is as follows: Let $\mu'$ be the mean value of the alternative
distribution for which the probability of Type II Error is to be held at
$\beta = 10\%$, and $z_{1-\alpha/2}$ the normal (Gaussian) deviate such that

$$\mathrm{Prob}\{z < z_{1-\alpha/2}\} = 1 - \alpha/2$$

That is,

$$\mathrm{Prob}\{z > z_{1-\alpha/2}\} = \alpha/2 \qquad (1)$$

The reason for using $\alpha/2$ instead of $\alpha$ is to allow for two-sided
deviations so that the total Type I Error is $\alpha$.

The following relations now hold where $x_c$ is the critical point (see
Figure 1). That is, $x_c$ is a value such that if $\bar{x} > x_c$, the
hypothesis is rejected, and if $\bar{x} < x_c$, the hypothesis is accepted,
where $\bar{x}$ is the calculated sample mean. The sample mean $\bar{x}$ is
defined by the equation

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (2)$$

where $x_i$ are the observations which make up the sample.

Figure 1 is not quite complete, since deviations in either direction
are considered in the test. However, the symmetry is implied by using
$\alpha/2$ and $\beta/2$ rather than $\alpha$ and $\beta$. In terms of
$\alpha/2$,

$$z_{1-\alpha/2} = \frac{x_c - \mu}{\sigma/\sqrt{N}} \qquad (3)$$

while, in terms of $\beta/2$,

$$z_{\beta/2} = \frac{x_c - \mu'}{\sigma/\sqrt{N}} \qquad (4)$$

where $\sigma^2$ is the theoretical variance of the distribution being
sampled. When the true mean is $\mu'$,

$$\mathrm{Prob}(\bar{x} < x_c) = \beta/2 \; ; \quad \text{Type II Error}$$


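In the special situation where one sets $\alpha = \beta$, it follows that
$z_{\alpha/2} = z_{\beta/2}$. Also, due to the symmetry of the normal
distribution, $z_{1-\alpha/2} = -z_{\alpha/2}$. Hence, from Eqs. (3) and (4),

$$\frac{x_c - \mu}{\sigma/\sqrt{N}} = z_{1-\alpha/2} = -z_{\alpha/2} = -z_{\beta/2} = -\frac{x_c - \mu'}{\sigma/\sqrt{N}}$$

Solving for $x_c$ one obtains

$$x_c = \frac{\mu + \mu'}{2} \qquad (5)$$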
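Then substituting back in Eq. (4),

$$z_{\beta/2} = \frac{\dfrac{\mu + \mu'}{2} - \mu'}{\sigma/\sqrt{N}} = \frac{(\mu - \mu')/2}{\sigma/\sqrt{N}}$$

Letting $\Delta\mu = \mu - \mu'$, and solving for N:

$$N = \frac{\sigma^2 z_{\beta/2}^2}{(\Delta\mu/2)^2} = 4 \left( \frac{\sigma \, z_{\alpha/2}}{\Delta\mu} \right)^2 \qquad (6)$$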


Note that the implicit assumption has been made that $\sigma^2 = (\sigma')^2$,
where $(\sigma')^2$ is the alternative variance. Therefore, this test would be
properly performed after the alternative variance $(\sigma')^2$ had been
determined to be statistically equivalent to the theoretical value $\sigma^2$.


Computational Example

The calculation of N based on the test for equivalent means is illustrated
as follows. Assume from independent considerations one obtains the
theoretical values

$$\mu = 50, \qquad \sigma^2 = 25$$

The required sample size to detect a 10% difference in means (namely
$\Delta\mu = 5$ here) with a Type II Error of $\beta = 10\%$ is then calculated
by applying Eq. (6). For $\alpha = 10\%$, the term $z_{1-\alpha/2} = 1.645$. Thus

$$N = 4 \left( \frac{5}{5} \right)^2 (1.645)^2 = 10.8 \approx 11$$

This sample size of N = 11 will be compared later to the sample size
required for equivalent variances.
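The calculation in Eq. (6) is easy to script. The following is a minimal sketch for the special case $\alpha = \beta$ treated above; the function name and its return convention are illustrative, and the normal deviate is taken from Python's standard library instead of a printed table.

```python
import math
from statistics import NormalDist


def sample_size_for_means(sigma: float, delta_mu: float, alpha: float = 0.10):
    """Eq. (6) with alpha = beta: N = 4 * (sigma * z / delta_mu)**2.

    Returns the unrounded value and the next larger integer sample size.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}, two-sided test
    n_exact = 4.0 * (sigma * z / delta_mu) ** 2
    return n_exact, math.ceil(n_exact)


if __name__ == "__main__":
    # Values of the computational example: mu = 50, sigma^2 = 25, 10% difference in means.
    n_exact, n = sample_size_for_means(sigma=5.0, delta_mu=5.0, alpha=0.10)
    print(f"N = {n_exact:.2f}, rounded up to {n}")
```

Because N varies as the inverse square of $\Delta\mu$, halving the difference to be detected quadruples the required sample size.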
2.2  SAMPLE SIZE FOR EQUIVALENCE OF VARIANCES

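The reasoning for obtaining a formula to compute the sample size for
the variance equality test proceeds in a similar manner. Let $s_c^2$ represent
the critical point. Then, since $(N-1)s^2/\sigma^2$ has a $\chi^2$ distribution
with (N - 1) d.f., one has

$$s_c^2 = \frac{(\sigma')^2}{N-1} \, \chi^2_{\beta/2} \qquad (7)$$

and

$$s_c^2 = \frac{\sigma^2}{N-1} \, \chi^2_{1-\alpha/2} \qquad (8)$$

where $\chi^2_{\beta/2}$ and $\chi^2_{1-\alpha/2}$ are percentage points of
the $\chi^2$ distribution with (N - 1) d.f.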

The sample (unbiased) variance $s^2$ is defined by the formula

$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (9)$$


Equating (7) and (8) and rearranging terms gives, for the case $\alpha = \beta$,

$$\frac{(\sigma')^2}{\sigma^2} = \frac{\chi^2_{1-\alpha/2}}{\chi^2_{\alpha/2}} \qquad (10)$$

Although (N - 1) cancels out, $\chi^2$ is a function of (N - 1). Therefore,
when $\sigma^2$ and $(\sigma')^2$ are specified, a trial and error inspection
of a $\chi^2$ table will give values of $\chi^2$ for some number of d.f. such
that Eq. (10) holds true.
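Computational Example

For example, for (N - 1) = 29 d.f. and $\alpha = \beta = 10\%$, one finds in
the $\chi^2$ table the 95 and 5 percentile points 42.6 and 17.7, so that

$$\frac{\chi^2_{.95}}{\chi^2_{.05}} = \frac{42.6}{17.7} = 2.41$$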
For all practical purposes this corresponds to the desired ratio of standard
deviations of 1.55 to 1.0. Therefore, a convenient sample size to test for
variance equivalence is 30.

Note that the variance equivalence test has a larger required sample
size than the mean equivalence test, namely N = 30 as compared to N = 11.
Therefore, this would determine the overall sample size for the experiment.
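The trial and error inspection of the $\chi^2$ table can be imitated numerically. The sketch below is only an approximation under stated assumptions: it replaces the printed table with the Wilson-Hilferty approximation to the $\chi^2$ percentage points (a substitution made here to stay within the standard library), it takes $\alpha = \beta$, and the function names are illustrative.

```python
from statistics import NormalDist


def chi2_percentile(p: float, dof: int) -> float:
    """Approximate chi-square point with cumulative probability p (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def variance_ratio(dof: int, alpha: float = 0.10) -> float:
    """Right-hand side of Eq. (10): ratio of the upper to the lower percentage point."""
    return chi2_percentile(1.0 - alpha / 2.0, dof) / chi2_percentile(alpha / 2.0, dof)


def sample_size_for_variances(sd_ratio: float, alpha: float = 0.10, max_n: int = 10000) -> int:
    """Smallest N whose chi-square ratio does not exceed the squared ratio of standard deviations."""
    target = sd_ratio ** 2
    for n in range(3, max_n + 1):
        if variance_ratio(n - 1, alpha) <= target:
            return n
    raise ValueError("no sample size found below max_n")


if __name__ == "__main__":
    print(f"ratio at 29 d.f.: {variance_ratio(29):.2f}")
    print(f"N for a 1.55 to 1.0 ratio of standard deviations: {sample_size_for_variances(1.55)}")
```

The search lands within a point or two of the N = 30 chosen in the example; the small difference comes from rounding in the table and from the approximation used for the percentage points.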
3.  REDUCTION OF SAMPLE SIZE


Suppose that it is desired to reduce a large sample of size M to a
smaller size n (n < M). The purpose of this section is to describe the
method of reduction. It is not intended here to discuss how to obtain the
original sample of M data points, but the method also applies to the original
sampling since one can consider that an infinite amount of data is reduced to
M data points. Assume that all data are from the same population. If
one suspects that they come from two different populations, one has to
partition them into two disjoint groups before the analysis is performed. That
problem belongs to the topic of classification analysis and is not discussed
here.

The reduction is conceived in two steps:

1.  eliminate all bad observations (outliers)

2.  reduce a sample consisting of a large number of data
points to a smaller representative sample for detailed
analysis

3.1  METHODS OF REDUCTION

Step 1. Elimination of Outliers (Bad Observations)

Often a data sample of size M contains some erroneous data which are
called outliers. These errors result from such factors as instrumentation
or human errors.

A statistic which is used to detect outliers is R/s, the range divided by
the sample standard deviation. Here the sample standard deviation s is an
independent external estimate of the standard deviation obtained from
concurrent or past data, not from the sample on hand. A test for outliers
can be performed if the percentile points of R/s are available. These
percentile points, when the underlying data are from a normal distribution,
are shown in Table 1 (Reference 1).
                      n
  d.f.        5       10       15       20

    20      4.23     5.01     5.43     5.71
    30      4.10     4.83     5.21     5.48
    40      4.04     4.74     5.11     5.36
    60      3.98     4.65     5.00     5.24
   120      3.92     4.56     4.90     5.13
    ∞       3.86     4.47     4.80     5.01

Table 1. Table of 95 Percentiles, c(.95), of the Distribution of R/s

[The parameter n is the sample size and d.f. is the number
of degrees-of-freedom in the independent standard deviation s.]


Let $x_1, \ldots, x_M$ be the sample of size M. Denote $x_{(1)} = \min_i x_i$
and $x_{(M)} = \max_i x_i$. Then $R = x_{(M)} - x_{(1)}$ and the critical
region for rejection is $R/s > c(\alpha)$, where $c(\alpha)$ is the
$100\alpha$ percentile point from the available table. If $R/s > c(\alpha)$,
the rejection rule is:

reject $x_{(1)}$ if $(\bar{x} - x_{(1)}) > (x_{(M)} - \bar{x})$

reject $x_{(M)}$ if $(\bar{x} - x_{(1)}) < (x_{(M)} - \bar{x})$

reject both $x_{(1)}$ and $x_{(M)}$ if $(\bar{x} - x_{(1)}) = (x_{(M)} - \bar{x})$

where $\bar{x}$ is the mean.


One application of the above technique in fatigue testing is as follows.
Suppose that one has a sample of large size on hand. Now assume that a
smaller second sample is obtained from the same system. It is suspected
that a few data points of the second group are set apart from the others. One
wonders whether or not they are far enough from the others so that one can
reject them as being caused by some assignable but thus far unascertained
cause. Now one applies the above technique to decide whether the data
should be kept or not. In this case the range is computed from the second
sample and the standard deviation is computed from the first group to apply
the rejection rule described in Step 1.

If rejection occurs, then the sample size is reduced to (M - 1) or
(M - 2) from M. Now a new range and a new mean are computed
from the remaining data. Then the same procedure as described above is
applied to detect the next possible outlier(s). The procedure is continued
recursively until $R/s < c(\alpha)$. Let N denote the reduced sample size from
which all outliers have been removed.

Example 1:  Assume that the following 21 measurements are made from a
normally distributed record. Further assume that the
measurements are made sequentially at fixed intervals of time.

    52  55  56  49  33  56  44
    55  43  40  24  44  41  39
    45  59  36  51  44  45  45

Suppose that it is desired to check for possible outliers. Assume
that the above data are obtained from the same source as 1000
previous data in which the sample standard deviation was found to
be s = 6.5. Then proceed as follows.

    $\bar{x}$ = 45.52
    c(.95) = 5.01
    R = 59 - 24 = 35
    R/s = 35/6.5 = 5.38
    5.38 > 5.01
Therefore, outlier(s) are indicated. Since $(\bar{x} - 24) > (59 - \bar{x})$,
one rejects 24 as being an outlier from the above data at the 95%
confidence level. Next, one computes that

    $\bar{x}$ = 46.6  (new mean)
    R' = 59 - 33 = 26
    R'/s = 26/6.5 = 4.00 < 5.01 = c(.95)

Thus, all the rest of the data are kept as good observations.
The variance of the 20 remaining data points is 52.46.
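The recursive rejection rule of Step 1 can be sketched in a few lines. This is an illustrative implementation, not the report's own procedure listing: the function name is hypothetical, and for simplicity the critical value c(.95) = 5.01 is held fixed, whereas strictly it should be re-read from Table 1 whenever the sample size changes appreciably.

```python
from statistics import mean, variance

# The 21 measurements of Example 1.
EXAMPLE_1 = [52, 55, 56, 49, 33, 56, 44, 55, 43, 40, 24,
             44, 41, 39, 45, 59, 36, 51, 44, 45, 45]


def reject_outliers(data, s_external, c_alpha):
    """Apply the R/s rule repeatedly; s_external is the independent standard deviation."""
    kept = list(data)
    rejected = []
    while len(kept) > 2:
        low, high = min(kept), max(kept)
        if (high - low) / s_external <= c_alpha:
            break  # range is no longer significant, so stop rejecting
        x_bar = mean(kept)
        low_gap, high_gap = x_bar - low, high - x_bar
        if low_gap >= high_gap:  # smallest value is farther from the mean (or tied)
            kept.remove(low)
            rejected.append(low)
        if high_gap >= low_gap:  # largest value is farther from the mean (or tied)
            kept.remove(high)
            rejected.append(high)
    return kept, rejected


if __name__ == "__main__":
    kept, rejected = reject_outliers(EXAMPLE_1, s_external=6.5, c_alpha=5.01)
    print("rejected:", rejected)
    print(f"new mean = {mean(kept):.1f}, variance = {variance(kept):.2f}")
```

Run on the data of Example 1, the sketch rejects only the value 24 and leaves 20 points with mean 46.6 and variance 52.46, in agreement with the hand calculation.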


Step 2. Resampling

Now assume that it is desired to reduce the sample size from N to n,
(n < N). Suppose that the N data points are numbered 1 to N in an arbitrary
manner. Three methods of reduction are discussed below.
a)  Simple  Random  Reduction 


This method is the simplest one. One simply randomly selects n
points out of the sample of size N. A convenient method of selecting random
samples is to apply commonly available "random number" tables. If one
reads 23, 4, 13, ... from a table, then one selects the 23rd, 4th, 13th, ...
data points from the N data until a total of n data points is obtained. The
variance of the mean, $\mathrm{Var}(\bar{x})$, of the selected data in terms of
the original N data is

$$\mathrm{Var}(\bar{x}) = \frac{N - n}{N n} \, s^2 \qquad (11)$$


where $s^2$ is the sample variance of the N data points. Sometimes it is
descriptive to talk about the precision of the estimate. Precision is defined
as the reciprocal of the variance. Thus, in the case of Eq. (11), the
precision of $\bar{x}$ obtained by a simple random reduction is

$$P(\bar{x}) = \frac{N n}{(N - n) \, s^2}$$

It refers to the measure of precision of $\bar{x}$ obtained by repeated
application of the same reduction procedure. It is obvious that the smaller
the variance, the higher the precision of any estimate.
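A simple random reduction and the variance of Eq. (11) can be sketched as follows. The pseudo-random generator stands in for the random number table mentioned above, and the function names and seed are illustrative.

```python
import random


def simple_random_reduction(data, n, seed=None):
    """Select n of the N data points at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(data), n)


def var_mean_random(big_n: int, n: int, s2: float) -> float:
    """Eq. (11): variance of the mean of a simple random reduction from N to n points."""
    return (big_n - n) / (big_n * n) * s2


def precision(var: float) -> float:
    """Precision is the reciprocal of the variance."""
    return 1.0 / var


if __name__ == "__main__":
    # N = 20 points with s^2 = 52.46 (the data of Example 1 after outlier removal), reduced to n = 5.
    v = var_mean_random(20, 5, 52.46)
    print(f"Var = {v:.2f}, precision = {precision(v):.3f}")
```

The factor (N - n)/N is the usual finite population correction: as n approaches N the variance of the reduced mean goes to zero, since nothing has been discarded.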


b)  Systematic Reduction

Let k = [N/n] be the greatest integer not larger than the quantity N/n.
Partition the N data points into n disjoint subgroups of size k. In sampling
theory these subgroups are called strata. Since N is not, in general, an
integral multiple of n, different strata may vary by one data point in size.
Select one sample data point from the first stratum at random and every kth
data point thereafter. The variance of the mean of systematically reduced
data is

$$\mathrm{Var}(\bar{x}) = \frac{N-1}{N} \, s^2 - \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_i)^2 \qquad (12)$$

where $s^2$ is the variance of the N data points, $x_{ij}$ denotes the jth
sample point of the ith stratum, and $\bar{x}_i$ denotes the mean of the ith
stratum. Equation (12) is proved in Reference 2. When Eq. (11) is compared
with Eq. (12), one can state that the mean of systematically reduced data is
more precise than the mean of a simple random sample if and only if

$$\frac{N-1}{N} \, s^2 - \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_i)^2 < \frac{N-n}{N n} \, s^2$$

or

$$\left( N - \frac{N}{n} \right) s^2 < \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_i)^2$$

Since $N/n \approx k$ one obtains the following condition.

$$s^2 < \frac{1}{k(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_i)^2 \qquad (13)$$


This result implies that systematic reduction has more precision than simple
random reduction if the variance within the systematic sample is larger than
the original variance of the N data. That is, systematic sampling is favorable
when the reduced data are heterogeneous and unfavorable when they are
homogeneous. If the population has a periodic trend, effectiveness of the
method depends on the value of k. The least favorable case occurs if k is
an integral multiple of the period. A favorable case occurs when k is an
odd multiple of a half period. See Figure 2.


Figure 2. Periodic Variation. B denotes unfavorable case and
G denotes favorable case.


In  the  case  where  the  population  values  occur  as  a  linear  trend,  the  systematic 
method  is  the  most  efficient  technique  available. 
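The dependence on k for a periodic population can be checked numerically. The sketch below is an illustration under assumed values (a pure sinusoid with a period of 10 points and N = 200); it computes the mean of every possible systematic sample for a given k and looks at how much those means scatter with the random starting point.

```python
import math
from statistics import mean, pvariance


def systematic_means(data, k):
    """Mean of each of the k possible systematic samples (one per starting point)."""
    return [mean(data[start::k]) for start in range(k)]


PERIOD = 10   # assumed period, in data points
N_POINTS = 200
signal = [math.sin(2.0 * math.pi * t / PERIOD) for t in range(N_POINTS)]

# k equal to the period: every selected point has the same phase (case B).
var_full_period = pvariance(systematic_means(signal, PERIOD))
# k equal to half the period: selected points alternate in sign (case G).
var_half_period = pvariance(systematic_means(signal, PERIOD // 2))

if __name__ == "__main__":
    print(f"k = period:      variance of the sample mean = {var_full_period:.3f}")
    print(f"k = half period: variance of the sample mean = {var_half_period:.2e}")
```

When k equals the period the reduced sample simply repeats one phase of the cycle, so its mean is as variable as a single observation; when k is half the period, successive points cancel and the mean is essentially exact.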
Example 2:  Suppose that it is desired to reduce the 20 data points of
Example 1 to size 5 by the systematic reduction. Arrange
the data in 5 strata as follows.

  stratum
  No. i      data $x_{ij}$       $\sum_j (x_{ij} - \bar{x}_i)^2$     $s_i^2$

    1       52  55  56  49              30                10.00
    2       33  56  44  55             350               116.67
    3       43  40  44  41              10                 3.33
    4       39  45  59  36             313               104.33
    5       51  44  45  45              31                10.33
Now choose a data point at random from the first four
measurements, say 55 (second data point). Then select
every fourth data point thereafter. Thus, the reduced sample
data are

    55  56  40  45  44

The mean and the variance of the mean of the reduced data in
this case are obtained by Eq. (12):

    $\bar{x}$ = 48.0

    Var($\bar{x}$) = (19/20)(52.46) - (1/20)(30 + 350 + 10 + 313 + 31)

                   = 49.84 - 36.70 = 13.14

If a random reduction is used for the above data, one
obtains by Eq. (11)

    Var($\bar{x}$) = [(20 - 5) / ((20)(5))] (52.46) = 7.87

Thus, in the above case, the random reduction gives a
more efficient result.
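The arithmetic of Example 2 can be reproduced with a short sketch. It evaluates the variance expression exactly as it is used in the example, namely (N - 1)/N times the overall variance minus 1/N times the total within-stratum sum of squares; the function names are illustrative.

```python
from statistics import mean, variance

# The 20 data points of Example 1 (outlier removed), arranged in 5 strata of 4.
STRATA = [
    [52, 55, 56, 49],
    [33, 56, 44, 55],
    [43, 40, 44, 41],
    [39, 45, 59, 36],
    [51, 44, 45, 45],
]


def within_sum_of_squares(stratum):
    """Sum of squared deviations of a stratum about its own mean."""
    m = mean(stratum)
    return sum((x - m) ** 2 for x in stratum)


def var_mean_systematic(strata):
    """Variance of the mean of a systematic reduction, as evaluated in Example 2."""
    pooled = [x for stratum in strata for x in stratum]
    big_n = len(pooled)
    s2 = variance(pooled)
    within = sum(within_sum_of_squares(stratum) for stratum in strata)
    return (big_n - 1) / big_n * s2 - within / big_n


def systematic_sample(strata, start):
    """Take the point at position `start` in the first stratum and every kth point after it."""
    return [stratum[start] for stratum in strata]


if __name__ == "__main__":
    sample = systematic_sample(STRATA, start=1)
    print("reduced sample:", sample, "mean:", mean(sample))
    print(f"Var = {var_mean_systematic(STRATA):.2f}")
```

The sketch gives about 13.17 where the example quotes 13.14; the difference comes only from the rounded sums of squares (313 and 31) used in the hand calculation.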
c)  Stratified Reduction

In this method the N data are partitioned into n disjoint strata of
approximately equal size according to some characteristic such as time or
magnitude. Then one data point is selected from each stratum at random.
(See Figure 2.) If one suspects any periodic trend and the period is unknown,
then the stratified reduction is recommended over method (b). The estimate of
the mean by the stratified reduction is given by

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \qquad (14)$$

where $x_i$ is the data point selected from the ith stratum, and its
variance is

$$\mathrm{Var}(\bar{x}) = \frac{k-1}{k} \, \frac{1}{n^2} \sum_{i=1}^{n} s_i^2 \qquad (15)$$

where k is the stratum size, and $s_i^2$ denotes the variance of the ith
stratum. Note that $k \approx [N/n]$.

The set of elements upon which the sample size reduction operation is
performed is called a frame, and in many practical situations a given
population conceivably contains a number of different frames. In Figure 3, a
system is stratified into three zones in two ways.


Figure 3. Zone Stratifications of a System
[Two frames, Frame I and Frame II, each dividing the system into zones A, B, and C]
Consider two frames of stratification as shown in Figure 3. Suppose it
is known that the between-strata variance of three subregions A, B, and C
of Frame I is greater than the corresponding variance of Frame II. This
implies that the sum of within-strata variances of Frame I is less than that
of Frame II. Consequently, by Eq. (15), Frame I is preferred over Frame II.
Thus, the choice of a frame is often an important aspect of sample design.

In general, a good frame is one which is heterogeneous between subregions
and homogeneous within each subregion. Total sample size is then allocated
into three zones according to the importance of each zone, such as size and
sensitivity.

Example 3:  Consider again the data from Example 2. By stratified
random reduction, one selects one data point from every
stratum. That is, first point from (52, 55, 56, 49) and
second point from (33, 56, 44, 55), etc. Thus, the
reduced sample data might be

    55  44  40  45  51

The mean and its variance for the reduced data in this case
are obtained by Eqs. (14) and (15).

    $\bar{x}$ = 47.0

    Var($\bar{x}$) = [(20 - 5)/20] (1/5^2) (10.0 + 116.67 + 3.33 + 104.33 + 10.33)

                   = 7.34

Thus, the stratified reduction method yields the most efficient
estimate in the above example.
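A stratified reduction of the same data can be sketched as below. The variance function assumes equal-size strata with one point drawn from each, which is the situation of Example 3; under that assumption the variance of the mean is (k - 1)/k times the sum of the stratum variances divided by n squared. The function names and the seed are illustrative.

```python
import random
from statistics import variance

# Strata of Example 2 (k = 4 points in each of n = 5 strata).
STRATA = [
    [52, 55, 56, 49],
    [33, 56, 44, 55],
    [43, 40, 44, 41],
    [39, 45, 59, 36],
    [51, 44, 45, 45],
]


def stratified_reduction(strata, seed=None):
    """Select one data point at random from every stratum."""
    rng = random.Random(seed)
    return [rng.choice(stratum) for stratum in strata]


def var_mean_stratified(strata):
    """Variance of the stratified mean for equal-size strata, one point per stratum."""
    n = len(strata)
    k = len(strata[0])
    return (k - 1) / k * sum(variance(stratum) for stratum in strata) / n ** 2


if __name__ == "__main__":
    print("one possible reduced sample:", stratified_reduction(STRATA, seed=0))
    print(f"Var = {var_mean_stratified(STRATA):.2f}")
```

For these data the three methods rank as stratified (about 7.3), simple random (7.87), and systematic (about 13.1), which matches the conclusion of the example.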
4.  AUTOCORRELATION FUNCTION ESTIMATES


Certain arbitrary quantities must be decided upon for autocorrelation
function estimates. First, an acceptable percentage normalized standard
error $\epsilon$ referred to the value of $R(\tau)$ at $\tau = 0$ (the mean
square value) must be established. Second, the bandwidth of the signal being
analyzed must be known or estimated.

It can be shown that, under the assumption of a Gaussian process, the
normalized standard error $\epsilon = \epsilon(0)$ is

$$\epsilon = \frac{1}{\sqrt{BT}} \qquad (16)$$

where T is the record length used in the analysis, and B is the signal
bandwidth appropriately defined. For Eq. (16) to theoretically hold true, the
process x(t) should have a flat spectrum B cps wide with a perfectly sharp
cutoff. In practice, B is much more difficult to define. For experiment
planning purposes, one can only be careful to estimate the bandwidth B
conservatively, that is, to err on the small side. When an experiment is
completed or one has other reasons to know the shape of the spectrum, other
problems arise. For example, suppose the spectrum of x(t) has the shape
indicated in the sketch below.

[Sketch: spectrum of x(t) with more than one peak, plotted against frequency f]
In such a situation, probably one should choose as B the sum of the
half-power point bandwidths of each peak. Similar judgments must be made in
other complicated situations.

From Eq. (16), it is straightforward to compute a required record
length T. For example, suppose it is desired to maintain $\epsilon = 10\%$ and
B is known to be 2000 cps, then

$$T = \frac{1}{B \epsilon^2} = \frac{1}{(2 \times 10^3)(10^{-2})} = 0.05 \text{ sec.}$$
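The record length calculation is a one-line inversion of Eq. (16). The sketch below is only a planning aid under the same idealization of a flat spectrum of width B; the function names are illustrative.

```python
def record_length(bandwidth_cps: float, eps: float) -> float:
    """Record length T in seconds from Eq. (16): T = 1 / (B * eps**2)."""
    return 1.0 / (bandwidth_cps * eps ** 2)


def normalized_error(bandwidth_cps: float, record_sec: float) -> float:
    """Normalized standard error from Eq. (16): eps = 1 / sqrt(B * T)."""
    return (bandwidth_cps * record_sec) ** -0.5


if __name__ == "__main__":
    t = record_length(2000.0, 0.10)
    print(f"T = {t:.3f} sec")
    # Underestimating B (the conservative choice) lengthens the planned record.
    print(f"T with B = 1000 cps: {record_length(1000.0, 0.10):.3f} sec")
```

Since T varies as the inverse square of the error, tightening the requirement from 10% to 5% quadruples the record length.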


If one collects N independent discrete observations, then the variance
of the autocorrelation estimate is (see Reference 3, p. 358),

$$\mathrm{Var}\left[ \hat{R}_x(\tau) \right] = \frac{R_x^2(0) + R_x^2(\tau)}{N} \qquad (17)$$

Note that this expression depends on the true (unknown in general)
autocorrelation function of the process being analyzed. The normalized
standard error is

$$\epsilon(\tau) = \frac{1}{\sqrt{N}} \left[ 1 + \frac{R_x^2(0)}{R_x^2(\tau)} \right]^{1/2} \qquad (18)$$

From this equation, if $R_x(\tau)$ is known, one can obtain the necessary
sample size for any point on the correlation function.
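Solving the normalized standard error for N gives $N = [1 + R_x^2(0)/R_x^2(\tau)]/\epsilon^2$, where the standard error is the square root of Eq. (17) divided by $R_x(\tau)$. The sketch below evaluates this for a given ratio $R_x(\tau)/R_x(0)$; the function name and the example ratios are illustrative.

```python
import math


def samples_for_autocorrelation(eps: float, r_ratio: float = 1.0) -> int:
    """Independent observations needed for normalized error eps at a lag where
    R(tau)/R(0) = r_ratio (r_ratio = 1 corresponds to tau = 0)."""
    if r_ratio == 0.0:
        raise ValueError("a percentage error is undefined where R(tau) = 0")
    n_exact = (1.0 + 1.0 / r_ratio ** 2) / eps ** 2
    # Round away floating-point noise before taking the next larger integer.
    return math.ceil(round(n_exact, 6))


if __name__ == "__main__":
    for ratio in (1.0, 0.5, 0.1):
        n = samples_for_autocorrelation(0.05, ratio)
        print(f"R(tau)/R(0) = {ratio}: N = {n}")
```

At $\tau = 0$ a 5% error needs N = 800, while a lag at which the correlation has dropped to a tenth of its peak value needs about fifty times as many observations.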

Certain  problems  arise  in  deciding  upon  the  necessary  accuracy  for 
a  correlation  estimate.  For  example,  a  typical  correlation  function  has  the 
form  illustrated  in  Figure  4  below. 
Figure 4. Typical Autocorrelation Function


A percentage-wise accurate estimate of $R(\tau)$ at one of the points where
$R(\tau)$ is near zero would require an inordinately large sample size N. For
this reason, it is not feasible to select sample sizes which will maintain a
small "percent of reading" error for all values of $R(\tau)$. A "percent of
full scale" type error is a more reasonable quantity for this application.
That is, the value of $R(\tau)$ at $\tau = 0$ (the maximum value of $R(\tau)$)
should dictate sample size requirements for estimating the entire correlation
function. Note that in Eq. (17) $R^2(\tau)$ is bounded above by $R^2(0)$ and
below by zero so that the maximum variability takes place at $R^2(0)$. In
this sense, basing sample size requirements for $R(\tau)$ entirely on the
point $\tau = 0$ is conservative and proper. If one employs the often used
relation N = 2BT for relating continuous and discrete samples, then
Eq. (18) will reduce to Eq. (16) at $\tau = 0$, namely,

$$\epsilon = \sqrt{\frac{2}{N}} = \frac{1}{\sqrt{BT}} \qquad (19)$$

The relation N = 2BT gives the degrees-of-freedom in a signal x(t) with
a flat spectrum of width B with a perfectly sharp cutoff. Degrees-of-freedom
in this case means the number of independent points which uniquely determine
x(t). For this special situation, degrees-of-freedom is equivalent to the
sample size N (i.e., N independent observations). Additional discussion
of this point is given in Section 5 concerning power spectrum estimates.
When one is considering discrete observations which are statistically
correlated, modifications have to be made to Eq. (19). In terms of the
true autocorrelation function, the expression for $\epsilon$ becomes


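$$\epsilon^2 = \frac{2}{N} \sum_{r=-(N-1)}^{N-1} \left(1 - \frac{|r|}{N}\right) \frac{R_x^2(rh)}{R_x^2(0)} \tag{20}$$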


If one assumes the easily analyzed case of an exponential correlation function,
which occurs physically in the case of lowpass R-C filtered noise, then
Eq. (20) becomes approximately


$$\epsilon^2 \approx \frac{2}{N}\,\frac{1 + e^{-2bh}}{1 - e^{-2bh}} \tag{21}$$


where 


$$R_x(rh) = R_x(0)\,e^{-b|r|h} \tag{22}$$


For Eq. (22) to apply, the requirements N > 100, bh > 0.01 should be met.
This is only an approximation for other physically occurring situations, but
should usually be conservative and quite useful. In Figure 5, several curves
are drawn for various values of bh from which one can obtain N as a
function of ε or vice versa. The parameter b in Eq. (21) and Eq. (22) is
the noise bandwidth of the process x(t). For example, assume x(t) is
sampled at intervals of h = 0.01 sec and that the noise bandwidth is
b = 60 cps so that bh = 0.60. Further assume one wants to maintain ε = 5.0%.
Then, by inspecting Figure 5, one notes that a sample size N = 1490 is
necessary. Note that this implies a record length of T = Nh = 14.9 sec.
This compares with N = 800, T = 8.0 sec required for the case of independent
samples, which is given by the bottom curve for bh = ∞.
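
The reading taken from Figure 5 can be checked numerically. The following is a minimal sketch, assuming an exponential correlation R(rh) = R(0) exp(-b|r|h) and N large enough that the sum over lags can be replaced by an infinite geometric series, so that ε² ≈ (2/N)(1 + q)/(1 - q) with q = exp(-2bh). The function name is illustrative and not from the report.

```python
import math

def samples_for_correlated_data(eps: float, bh: float) -> float:
    """Sample size N giving normalized standard error eps at zero lag.

    Assumes R(rh) = R(0) * exp(-b*|r|*h) and N large, so that
    eps**2 ~ (2/N) * (1 + q) / (1 - q) with q = exp(-2*bh).
    bh = math.inf corresponds to independent observations.
    """
    n_independent = 2.0 / eps**2          # eps = sqrt(2/N) for independent data
    if math.isinf(bh):
        return n_independent
    q = math.exp(-2.0 * bh)
    return n_independent * (1.0 + q) / (1.0 - q)

h = 0.01                                   # sampling interval, sec
n_corr = samples_for_correlated_data(0.05, 60 * h)
n_ind = samples_for_correlated_data(0.05, math.inf)
print("correlated:  N =", round(n_corr), " T =", round(n_corr) * h, "sec")
print("independent: N =", round(n_ind), " T =", round(n_ind) * h, "sec")
```

Under these assumptions the two cases reproduce the sample sizes quoted in the example (about 1490 and 800).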
Figure 5. Standard Error for Correlated Products

5. POWER SPECTRUM ESTIMATES


As with the case of the correlation function, certain parameters must
be specified in advance. These are the resolution bandwidth B of the
analysis and the normalized standard error ε. The bandwidth B here should
not be confused with the signal bandwidth. Depending on whether the analysis
is to be performed with analog or digital methods, slightly different procedures
are used.

Assume that x(t) is a Gaussian process with a true power spectrum
G(f) which is approximately constant within the resolution bandwidth B.
Then it may be shown that an estimate Ĝ(f) follows a χ² distribution with
k degrees-of-freedom given by

$$\hat{G}(f) \sim \frac{G(f)\,\chi^2_k}{k} \tag{23}$$


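Here it is assumed that m points of Ĝ(f), B cps apart, are computed so that
N = mk. The symbol ~ is to be read "distributed as." The variance of
this quantity is then obtained directly as

$$\mathrm{Var}[\hat{G}(f)] = \frac{G^2(f)}{k^2}\,\mathrm{Var}[\chi^2_k] = \frac{2G^2(f)}{k} \tag{24}$$

The normalized standard error is

$$\epsilon = \frac{\sqrt{\mathrm{Var}[\hat{G}(f)]}}{G(f)} = \sqrt{\frac{2}{k}} = \frac{1}{\sqrt{BT}} \tag{25}$$

using k = 2BT with a proper interpretation here for B.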

To illustrate Eq. (25), if it is desired to maintain ε at, say, 10%,
and B = 10 cps, the required record length is

$$T = \frac{1}{B\epsilon^2} = \frac{1}{10(.01)} = 10 \text{ sec}$$

and the required number of degrees-of-freedom in each individual point of
Ĝ(f) is k = 2BT = 200.

A proper definition of the analysis bandwidth has not yet been given.
It is actually a difficult problem to specify a proper correspondence between
the degrees-of-freedom N and the BT product. In Reference 4 an "equivalent
bandwidth" is given as


$$B_e = \frac{\left[\int_0^\infty \tilde{G}(f)\,df\right]^2}{\int_0^\infty \tilde{G}^2(f)\,df} \tag{26}$$

where G̃(f) is the power spectrum resulting from the process after it has
been filtered by the analyzer filter. For practical purposes, either the noise
bandwidth or half-power point bandwidth of the analyzer filter may be used
instead of B_e due to the sharp cutoffs on modern spectrum analyzer filters.
The details of these considerations are discussed in Reference 5.

Therefore, in the case of analog power spectra computations, the half-power
point bandwidth may be used in Eq. (25) for computing required record
lengths. This, of course, assumes the filter bandwidth to be smaller than the
signal bandwidth, which must be the case for a proper analysis.

In the digital case, the analysis resolution bandwidth B is determined
by the sampling interval Δt and the number of points, m, computed for the
correlation function. In fact, for practical purposes, B is given by

$$B = \frac{1}{m\,\Delta t} \tag{27}$$


As  indicated  in  Reference  4,  page  36,  choosing  B  in  this  manner  is  not  quite 
theoretically  correct  for  actual  filters  but  is  satisfactory  for  almost  all 
practical purposes. This relation would be right if the filters effectively
occurring in a usual digital computation had perfectly sharp cutoffs. In
practice, they appear as indicated in the sketch below. The overlap of the
filters creates a small amount of correlation in neighboring points of the
spectrum estimate. This causes a slight inaccuracy in Eq. (17).

[Sketch: Ideal Filters for Digital Spectrum Estimates; Actual Filters for
Digital Spectrum Estimates]


A procedure for selecting the sample size and record length for a
digital analysis is as follows. Assume frequencies up to a cutoff frequency,
f_c = 2000 cps, are of interest and that a resolution bandwidth of B = 10 cps
has been chosen. The maximum lag number for the correlation function is
then

$$m = \frac{2f_c}{B} = 400 \tag{28}$$

To avoid aliasing below 2000 cps, the sampling frequency must be twice the
highest frequency of interest, which accounts for the 2 in Eq. (28). This gives
a sampling interval of

$$\Delta t = \frac{1}{2f_c} = .00025 \text{ sec}$$
Now if it is decided to maintain a normalized standard error ε = 10%, the
required record length is

$$T = \frac{1}{B\epsilon^2} = \frac{1}{10(.01)} = 10 \text{ sec}$$

The total number of observations required is

$$N = \frac{T}{\Delta t} = \frac{10}{.00025} = 40{,}000$$

The final calculations will then give 200 points of Ĝ(f) at 10 cps intervals,
each having 200 d.f.
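
The planning steps for a digital analysis can be collected into a short routine. This is a sketch only: the function and key names are illustrative, and it assumes the relations used above, namely m = 2f_c/B, Δt = 1/(2f_c), ε = 1/√(BT), N = T/Δt, and k = 2BT.

```python
def plan_digital_spectrum(f_c: float, B: float, eps: float) -> dict:
    """Planning quantities for a digital power spectrum analysis.

    f_c : highest frequency of interest (cps)
    B   : resolution bandwidth (cps)
    eps : desired normalized standard error (e.g. 0.10 for 10%)
    """
    m = 2.0 * f_c / B            # maximum lag number for the correlation function
    dt = 1.0 / (2.0 * f_c)       # sampling interval that avoids aliasing below f_c
    T = 1.0 / (B * eps**2)       # record length from eps = 1/sqrt(B*T)
    N = T / dt                   # total number of observations
    k = 2.0 * B * T              # degrees-of-freedom per spectral estimate
    return {"m": m, "dt": dt, "T": T, "N": N, "k": k}

plan = plan_digital_spectrum(f_c=2000.0, B=10.0, eps=0.10)
for name, value in plan.items():
    print(f"{name} = {value:g}")
```

Halving ε quadruples both T and N, which is usually the dominant cost consideration when choosing the error level.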

There is a point to note when comparing the power spectra and correlation
function estimates. When the assumption of a constant spectrum is
approximately fulfilled, then one obtains independent estimates of G(f) while
the point estimates for R(τ) are not independent but correlated. Therefore,
the normalized standard error requirements only apply to any single given point
of R(τ) at a time. However, the limits for G(f) apply to all computed points
simultaneously. That is, one could draw the ±10% confidence bounds about
G(f) as a whole but not for R(τ) as a whole. However, the basic considerations
for estimating sample sizes are not affected.

An additional point that one should realize is that specifying ε to be,
say, 10% only means that about 68% of all the estimates obtained would be
within ±10% of the true value (assuming the estimates are normally distributed).
Also, with this assumption, about 95% of the time, estimates will be within
±20% of the true value. If one wants the estimates to be, say, within ±p%
of the true value 95% of the time, then one must choose ε = (p/2)%. Of course, if
one draws a ±1ε = ±10% confidence band for 200 points in a power spectrum
estimate, one would expect 68% of the estimates to be within ±10% of the true value.
This means 32% of 200 or 64 true points would be expected to lie outside the
bands. One must adjust probabilities appropriately if it is desired for no true
value to lie outside the confidence band with a given probability.
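
One way to make the adjustment mentioned above is sketched here. It assumes the m spectral estimates are independent and normally distributed, so that the joint coverage of a band of half-width zε is the per-point coverage raised to the power m. This particular adjustment and the function names are illustrative choices, not taken from the report.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1

def expected_outside(m: int, z: float) -> float:
    """Expected number of the m points lying outside a +/- z*eps band."""
    coverage = 2.0 * _STD_NORMAL.cdf(z) - 1.0
    return m * (1.0 - coverage)

def band_multiplier(m: int, joint_prob: float) -> float:
    """Multiplier z such that all m independent points lie inside
    +/- z*eps with probability joint_prob."""
    per_point = joint_prob ** (1.0 / m)           # required per-point coverage
    return _STD_NORMAL.inv_cdf(0.5 * (1.0 + per_point))

print("expected points outside +/-1 eps band:", expected_outside(200, 1.0))
print("multiplier for 95% joint coverage:   ", band_multiplier(200, 0.95))
```

For 200 points the band must be widened to well over three standard errors before all points can be expected to lie inside it with 95% probability.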
6. CROSS-CORRELATION ESTIMATES


Let R_x(τ) and R_y(τ) be the autocorrelation functions of x(t) and y(t)
and let R_xy(τ) be the cross-correlation between x(t) and y(t). Then the
variance of the estimate R̂_xy(τ) is

$$\mathrm{Var}[\hat{R}_{xy}(\tau)] = \frac{R_x(0)R_y(0) + R_{xy}^2(\tau)}{N} \tag{29}$$


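Equation (29) is a direct generalization of Eq. (17) and is the formula for the
variance when the processes are jointly Gaussian and the estimate of R_xy(τ)
is based on N independent observations. The definition of the normalized
standard error for R̂_xy(τ) is

$$\epsilon = \frac{\sqrt{\mathrm{Var}[\hat{R}_{xy}(\tau)]}}{R_{xy}(\tau)} = \left\{\frac{1}{N}\left[\frac{R_x(0)R_y(0)}{R_{xy}^2(\tau)} + 1\right]\right\}^{1/2} \tag{30}$$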


In this case it is not convenient to talk about the normalized standard
error at τ = 0. The cross-correlation function R_xy(τ) does not necessarily
have a maximum at τ = 0 as do R_x(τ) and R_y(τ). However, one can
show that the square of the cross-correlation function is bounded by the
product of the zero values of the two autocorrelation functions, namely,

$$R_{xy}^2(\tau) \le R_x(0)R_y(0) \tag{31}$$


Therefore, if the cross-correlation function takes on a value close to
the maximum possible value, then it makes sense to employ the error
formula

$$\epsilon = \sqrt{\frac{2}{N}} \tag{32}$$

This is justified since if R²_xy(τ) is close to R_x(0)R_y(0), then

$$\frac{R_x(0)R_y(0)}{R_{xy}^2(\tau)} \approx 1$$

and Eq. (30) reduces to Eq. (32).

If one uses this relation for sample length requirements, a safety factor
should be inserted since positive correlation in the observations will tend to
reduce the effective sample size N. The relation N = 2BT then becomes
less and less applicable. This is demonstrated in the graph of Figure 5
since decreasing values of bh indicate larger and larger correlations of
nearby sample points. No such convenient analytical guideline as Figure 5 is
available for the cross-correlation case, however, since the forms of
cross-correlation functions are not so conveniently classified.

A tacit assumption is made throughout this discussion that a common
bandwidth B exists for the two signals. This, of course, is not necessarily
true for practical applications. For experiment planning purposes a
conservative choice should be made.

As an example, assume two signals x(t) and y(t) are to be cross
correlated. If their bandwidths are estimated to be B_1 = 2000 cps and
B_2 = 1000 cps, choose B = 1000 cps. Now, if a substantial peak is expected,
such as is the case when one signal is a time delayed version of another, then
the relation

$$\epsilon = \frac{1}{\sqrt{BT}}$$

may be reasonably employed. This quantity represents a "percent of full
scale" error now since the peak value of R²_xy(τ) should be nearly as large
as the product R_x(0) R_y(0). If ε = 10% is the desired error, then for
B = 1000 cps, the required record length is

$$T = \frac{1}{\epsilon^2 B} = 0.1 \text{ sec}$$
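
The sensitivity of the cross-correlation error to the height of the peak can be sketched as follows. The code assumes the variance formula for N independent observations, Var = [R_x(0)R_y(0) + R²_xy(τ)]/N, together with N = 2BT. The argument `peak_ratio`, defined as R²_xy(τ)/[R_x(0)R_y(0)], is a name introduced here for illustration.

```python
import math

def cross_corr_error(B: float, T: float, peak_ratio: float = 1.0) -> float:
    """Normalized standard error of a cross-correlation estimate.

    peak_ratio = R_xy(tau)**2 / (R_x(0) * R_y(0)), a value in (0, 1].
    Assumes N = 2*B*T independent observations.
    """
    if not 0.0 < peak_ratio <= 1.0:
        raise ValueError("peak_ratio must lie in (0, 1]")
    N = 2.0 * B * T
    return math.sqrt((1.0 / peak_ratio + 1.0) / N)

def record_length(B: float, eps: float) -> float:
    """Record length from eps = 1/sqrt(B*T), valid when peak_ratio is near 1."""
    return 1.0 / (eps**2 * B)

print("T for eps = 10%, B = 1000 cps:", record_length(1000.0, 0.10), "sec")
for ratio in (1.0, 0.5, 0.1):
    print(f"peak_ratio = {ratio}: eps = {cross_corr_error(1000.0, 0.1, ratio):.3f}")
```

The loop shows how quickly the error grows when the peak falls well below its upper bound, which is the reason for the safety factor recommended above.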
7. CROSS-SPECTRUM ESTIMATES


The cross-spectral density function is of major interest for the determination
of frequency response functions of linear systems. Therefore, the discussion
in the next Section 8 implicitly covers the most important aspects of
cross-spectrum estimates. About the only time that the cross-spectrum would be of
interest for its own sake is in determining a phase relation between two records.
This, of course, would be equivalent to estimating a time delay between the
two records, in which case the cross-correlation function would be employed.
It is fortunate that the cross spectrum is not usually of direct interest
itself. No convenient formulas exist for the variances and the sampling
distributions are very complicated. However, the variances of the co-spectrum and
quad-spectrum are bounded by (see Reference 6)


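$$\mathrm{Var}[\hat{C}_{xy}(f)] \le \frac{G_x(f)G_y(f)}{BT} \tag{33}$$

$$\mathrm{Var}[\hat{Q}_{xy}(f)] \le \frac{G_x(f)G_y(f)}{BT} \tag{34}$$

In the above equations

$$G_{xy}(f) = C_{xy}(f) + jQ_{xy}(f)$$

where C_xy(f) is the real part (co-spectrum) of G_xy(f) and Q_xy(f) the
imaginary part (quad-spectrum) of G_xy(f).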

One encounters problems similar to that of cross correlation in trying
to transform the quantity of Eq. (33) to a normalized standard error with
respect to |G_xy(f)|². From basic theory it is known that

$$C_{xy}^2(f) + Q_{xy}^2(f) = |G_{xy}(f)|^2 \le G_x(f)G_y(f) \tag{35}$$
but one does not know how much less |G_xy(f)|² is than G_x(f)G_y(f). Therefore,
in trying to divide by |G_xy(f)|², C²_xy(f), or Q²_xy(f) one cannot make any
statement about the magnitude of the resulting normalized standard error.
That is, if one defines

$$\epsilon^2 = \frac{G_x(f)G_y(f)}{|G_{xy}(f)|^2}\,\frac{1}{BT} \tag{36}$$

then an error formula of the usual form (1/√(BT)) can be employed only
if one assumes that G_x(f)G_y(f) ≈ |G_xy(f)|². This assumption is, at best,
somewhat questionable. In fact, it is strictly true only in the case where
one has a linear system relating x(t) and y(t) and there is no extraneous
noise affecting the measurement of these quantities. Therefore, it is
recommended that cross correlation or frequency response function error
formulas be used for experiment planning purposes.
8. FREQUENCY RESPONSE FUNCTION ESTIMATES


Sampling variability for frequency response functions is more complicated
than for the previous functions. The frequency response is a complex-valued
quantity which can be described in terms of a gain factor and a phase factor. The
errors in gain factor estimates and phase factor estimates as a function of
record length (sample size) are treated in Reference 7 and will be discussed
here.

The frequency response function characterizes a linear system. If one
knows the weighting function h(t) for a constant parameter, time-invariant
linear system, which is the response to a unit impulse input, then the
frequency response function H(f) is given as the Fourier transform of h(t).
In equation form,

$$H(f) = \int_{-\infty}^{\infty} h(t)\,e^{-j2\pi ft}\,dt \tag{37}$$


Also, H(f) is a complex number in general and may be written in exponential
form,

$$H(f) = |H(f)|\,e^{-j\phi(f)} \tag{38}$$

where |H(f)| is the gain factor and φ(f) is the phase factor of the
linear system.

When the frequency response function H(f) is the end result of interest,
a formula developed in Reference 7 gives the error in the gain factor |H(f)| and
phase factor φ(f) of H(f) as a function of the true coherence function γ²_xy(f)
and degrees-of-freedom k.


The frequency response function H(f) is related to the input power
spectrum G_x(f) and to the cross spectrum G_xy(f) by the formula

$$H(f) = \frac{G_{xy}(f)}{G_x(f)} \tag{39}$$

The coherence function γ²_xy(f) is also directly related to the input power spectrum
G_x(f), the output power spectrum G_y(f), and to the cross spectrum G_xy(f) by

$$\gamma_{xy}^2(f) = \frac{|G_{xy}(f)|^2}{G_x(f)G_y(f)} \tag{40}$$


The  coherence  function  gives  the  degree  of  linear  relationship  (correlation) 
as  a  function  of  frequency,  between  the  input  x(t)  and  the  output  y(t). 

The error formula for frequency response function measurements is

$$P = \mathrm{Prob}\left[\,\left|\frac{|\hat{H}(f)| - |H(f)|}{|H(f)|}\right| < \sin\theta \ \text{ and } \ |\hat{\phi}(f) - \phi(f)| < \theta\,\right] \approx 1 - \left[\frac{1 - \gamma_{xy}^2(f)}{1 - \gamma_{xy}^2(f)\cos^2\theta}\right]^{k/2} \tag{41}$$

where k = 2BT is the degrees-of-freedom for the spectral estimates. For small
values of θ, sin θ ≈ θ so that both inequalities hold for the same numerical
values. One applies the above formula by solving for k. Thus,


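$$\left[\frac{1 - \gamma_{xy}^2(f)}{1 - \gamma_{xy}^2(f)\cos^2\theta}\right]^{k/2} = 1 - P$$

$$k \log\left[\frac{1 - \gamma_{xy}^2(f)}{1 - \gamma_{xy}^2(f)\cos^2\theta}\right] = 2\log(1 - P)$$

$$k = \frac{2\log(1 - P)}{\log\left[\dfrac{1 - \gamma_{xy}^2(f)}{1 - \gamma_{xy}^2(f)\cos^2\theta}\right]} \tag{42}$$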
To apply Eq. (42), one chooses a value for θ, say 10% = .10, and a
value for P, say P = .90. Assume for the moment γ²_xy(f) is known to be
.90. Now, a value for k is calculated, in this case k ≈ 53, which will
maintain the sample gain factor |Ĥ(f)| within 10% of the true gain factor,
and the sample phase φ̂(f) within .10 radians of the true phase for
approximately 90 out of 100 experiments. Different values for γ²_xy(f) lead to
different k. This formula applies to one value of |H(f)| and φ(f). One needs a total
sample of N = mk for m points of the frequency response function.
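
The calculation of k can be sketched in a few lines. The code assumes the planning relation k = 2 log(1 - P) / log[(1 - γ²)/(1 - γ² cos²θ)]; the function name and argument names are illustrative.

```python
import math

def dof_required(theta: float, P: float, coherence_sq: float) -> float:
    """Degrees-of-freedom k needed so that gain and phase errors stay within
    sin(theta) and theta (radians) with probability P.

    coherence_sq is the assumed true coherence, 0 < coherence_sq < 1.
    """
    ratio = (1.0 - coherence_sq) / (1.0 - coherence_sq * math.cos(theta) ** 2)
    return 2.0 * math.log(1.0 - P) / math.log(ratio)

k = dof_required(theta=0.10, P=0.90, coherence_sq=0.90)
B = 10.0                         # analysis bandwidth, cps
print(f"k = {k:.1f}, T = k/(2B) = {k / (2.0 * B):.2f} sec")

# Lower coherence demands many more degrees-of-freedom for the same accuracy.
for g2 in (0.9, 0.7, 0.5, 0.3):
    print(f"coherence = {g2}: k = {dof_required(0.10, 0.90, g2):.0f}")
```

With the values used in the text the result is close to the quoted k ≈ 53.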

The choice of a value in advance for γ²_xy(f) is strictly a matter of
judgment if prior data is not available. From basic considerations,
0 ≤ γ²_xy(f) ≤ 1, analogous to the bounds on the correlation coefficient. For
purposes of planning an experiment, one must make a judgment based on the
degree of linearity believed to exist and the amount of extraneous noise
affecting the measurements. Both of these factors will reduce the coherence
of the system from a theoretical maximum value of unity. Also note that
coherence is a function of frequency so one must either restrict the range of
frequency for which the computed k will apply or one must estimate a worst
case in order to be conservative.

For convenience, several curves have been plotted giving k as a
function of γ²_xy(f). Three sets of these curves are plotted corresponding to
θ = .05, .10, and .15. In each set the curves correspond to P = .80, .85,
and .90. These curves are displayed in Figure 6.

In converting k or N to a record length, the same considerations as
for the ordinary power spectral density function apply. One does not have
the problems associated with the cross-correlation function since in
computing power spectra, the process is filtered by the analyzer (or effectively
so in the case of digital methods) and the filter bandwidth is employed in
the relation k = 2BT. In this situation one has the k required for a given
accuracy of one estimate for a fairly narrow bandwidth B, where B is
the analysis bandwidth. Therefore, since B is small, T must be relatively
large. For the example where k = 53 and B = 10 cps,

$$T = \frac{k}{2B} = \frac{53}{20} = 2.65 \text{ sec}$$

Figure 6. Data for Frequency Response Function Measurement Confidence
(number of d.f. of the estimate, k = 2BT, versus coherence function γ²_xy(f))


Another  formula  exists  from  which  one  may  obtain  confidence  bands 
which  are  a  function  of  the  sample  quantities  obtained  after  the  experiment  has 
been  performed.  This  is  to  be  contrasted  with  the  previous  formula  which  is 
useful  for  planning  where  the  true  coherence  must  be  estimated  in  advance. 
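The (1 - α) confidence limits for gain and phase are given by

$$|\hat{H}(f)| - \hat{r}(f) \le |H(f)| \le |\hat{H}(f)| + \hat{r}(f) \tag{43}$$

$$\hat{\phi}(f) - \Delta\hat{\phi}(f) \le \phi(f) \le \hat{\phi}(f) + \Delta\hat{\phi}(f) \tag{44}$$

where

$$\hat{r}^2(f) = \frac{1}{BT - 1}\,F_\alpha(2,\,2BT-2)\,\left[1 - \hat{\gamma}_{xy}^2(f)\right]\frac{\hat{G}_y(f)}{\hat{G}_x(f)} \tag{45}$$

and

$$\Delta\hat{\phi}(f) = \arcsin\left[\frac{\hat{r}(f)}{|\hat{H}(f)|}\right] \tag{46}$$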

In Eq. (45), F_α(2, 2BT-2) is the (1 - α) percentile of the standard F
distribution with degrees-of-freedom n_1 = 2 and n_2 = (2BT - 2). Hence,
Eqs. (43) and (44) give bounds that include the true gain |H(f)| and true
phase φ(f) with confidence (1 - α). Note that all quantities involved in the
relations are sample values. These formulas are all special cases of the
general equations found in Reference 8.
To illustrate these formulas, suppose the following values are obtained
from a frequency response function estimation experiment.

    Ĝ_x(f_0)      = 0.20 g²/cps
    Ĝ_y(f_0)      = 0.10 g²/cps
    γ̂²_xy(f_0)    = 0.80
    |Ĥ(f_0)|²     = 0.40
    φ̂(f_0)        = π/4 radians = 45°
    B             = 10 cps
    T             = 1 sec
    α             = .05
    F_α(2, 18)    = 7.21 (see Reference 1 tables for example)


Note  that  in  these  hypothetical  values,  the  square  of  the  gain  factor  does  not 
equal  the  ratio  of  the  output  to  the  input  spectra.  This  might  happen  in 
practice  as  a  result  of  the  effects  of  extraneous  noise  or  nonlinearities. 
From  Eqs.  (45)  and  (46),  the  following  values  are  obtained. 


$$\hat{r}(f_0) = \left[\frac{1}{9}(7.21)(1 - 0.80)\frac{(0.10)}{(0.20)}\right]^{1/2} = .283$$

$$\Delta\hat{\phi}(f_0) = \arcsin\left(\frac{.283}{.63}\right) = 26^\circ 37'$$

Therefore, 95% confidence intervals corresponding to Eqs. (43) and (44) are:

    .63 - .28 ≤ |H(f_0)| ≤ .63 + .28

    45° - 26°37' ≤ φ(f_0) ≤ 45° + 26°37'
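
The arithmetic of this example can be reproduced with the sketch below. It assumes the half-width relations r̂² = [2/(2BT - 2)] F (1 - γ̂²) Ĝ_y/Ĝ_x and Δφ̂ = arcsin(r̂/|Ĥ|), and it takes the F value as an input exactly as quoted in the example rather than computing it. Function and argument names are illustrative.

```python
import math

def gain_phase_halfwidths(G_x, G_y, coh_sq, gain, B, T, F_value):
    """Half-widths of the gain and phase confidence intervals.

    G_x, G_y : sample input and output spectra at the frequency of interest
    coh_sq   : sample coherence
    gain     : sample gain factor |H|
    F_value  : tabulated F percentile with (2, 2BT - 2) degrees-of-freedom
    Returns (r, dphi) with dphi in radians.
    """
    n2 = 2.0 * B * T - 2.0
    r = math.sqrt((2.0 / n2) * F_value * (1.0 - coh_sq) * G_y / G_x)
    dphi = math.asin(r / gain)
    return r, dphi

gain = math.sqrt(0.40)
r, dphi = gain_phase_halfwidths(0.20, 0.10, 0.80, gain, B=10.0, T=1.0, F_value=7.21)
print(f"gain:  {gain:.2f} +/- {r:.3f}")
print(f"phase: 45 deg +/- {math.degrees(dphi):.2f} deg")
```

Small differences from the printed phase half-width arise only from rounding the gain factor to two decimal places in the hand calculation.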
9. PROBABILITY DENSITY ESTIMATES


Considerable  experimental  and  theoretical  work  is  still  in  the  process 
of  being  performed  to  develop  proper  error  formulas  for  probability  density 
estimates.  Experiments  have  been  conducted  in  the  past  and  are  described  in 
Reference  9.  The  use  of  the  error  formula  developed  experimentally  in  that 
report  is  recommended  at  the  present  time.  This  formula  will  now  be 
presented  along  with  its  limitations. 

If  one  neglects  certain  bias  terms  which  are  unimportant  in  usual 
applications,  then  theory  predicts  a  variance  for  probability  density  estimates 
of 


$$\mathrm{Var}[\hat{p}(x)] \approx \frac{p(x)}{2BT\,\Delta x} = \frac{p(x)}{N\,\Delta x} \tag{47}$$


In  Eq.  (47),  B  is  the  bandwidth  of  the  process  where  a  perfectly  sharp  cutoff 
in  the  spectrum  is  assumed.  Also,  T  is  record  length,  p(x)  is  the  true 
value of the probability density, Δx is the amplitude "window" or resolution
of  the  measurement,  and  N  is  the  number  of  independent  observations 
(sample  size)  used  in  the  estimate. 

The  requirement  of  independent  samples  is  not  fulfilled  in  existing 
analog  measurements  nor  is  it  necessarily  in  digital  procedures.  Experiments 
were  performed  (Reference  9  )  which  indicate  that  this  is  a  significant  factor. 
The  results  of  those  experiments  indicate  that  a  usable  error  formula  is 


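$$\epsilon^2 = \frac{\mathrm{Var}[\hat{p}(x)]}{p^2(x)} \approx \frac{.04}{BT\,p(x)\,\Delta x} \tag{48}$$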


It must be emphasized that the above equation was developed only for
one specific instrument and is possibly valid for only that instrument. For
example, if a process was digitally sampled, and the observations were
uncorrelated, then, if the density function was calculated on a digital computer,
Eq. (47) rather than Eq. (48) applies. Experiments are presently being
designed so that error formulas similar to Eq. (48) may be developed for
other specific instruments.

In Eq. (48), the estimate p̂(x) would be used rather than the true value p(x) if
one was establishing limits about a measurement of p(x). Also, in practice,
noise bandwidth or half-power bandwidth can be used for B. As mentioned
above and described in Reference 9, the above formula was obtained with
only one specific instrument and only approximately Gaussian signals were
analyzed. Therefore, as with most other error formulas in existence,
one must be prudent in its application when the underlying assumptions are
not satisfied (for example, with non-Gaussian noise or different instruments).
However, Eq. (48) represents the best available result and does provide one with
reasonable guidelines for experiment planning purposes.

As an example of the application of Eq. (48), assume one has a signal
x(t) with a flat spectrum out to B = 2000 cps. Further assume an error
ε = 1% is desired for a point one standard deviation (1.0σ) away from the
mean and that the resolution is to be Δx = 0.1σ. For planning purposes,
suppose one expects a near normal density function. Then one obtains
p(1.0σ) = .242 from tables of the normal density function. The required
record length for the experiment then is

$$T = \frac{.04}{B\,p(x)\,\Delta x\,\epsilon^2} = \frac{.04}{2000(.242)(.1)(.0001)} = 8.26 \text{ sec} \tag{49}$$


For expected density functions other than Gaussian, one substitutes the
appropriate value for p(x). Also, it will be most convenient to work in terms
of standardized units, e.g., x/σ, rather than any absolute terms.
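
The record length calculation for a probability density estimate is sketched below, assuming the empirical relation T = c / (B p(x) Δx ε²) with c = .04 and a standardized Gaussian density for planning. The constant is exposed as a parameter because it was obtained for one specific instrument; the function name is illustrative.

```python
from statistics import NormalDist

def record_length_pdf(B: float, p_x: float, dx: float, eps: float,
                      c: float = 0.04) -> float:
    """Record length T (sec) for a density estimate at one amplitude level.

    B   : bandwidth of the process (cps)
    p_x : expected density at the level of interest (standardized units)
    dx  : amplitude window width (standardized units)
    eps : desired normalized standard error
    c   : empirical constant (.04 assumed, instrument dependent)
    """
    return c / (B * p_x * dx * eps**2)

p_1sigma = NormalDist().pdf(1.0)      # standard normal density at 1.0 sigma
T = record_length_pdf(B=2000.0, p_x=p_1sigma, dx=0.1, eps=0.01)
print(f"p(1.0 sigma) = {p_1sigma:.3f}, T = {T:.2f} sec")
```

Because p(x) appears in the denominator, levels far out in the tails of the density require much longer records for the same ε.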

The  final  fact  to  be  emphasized  is  that  Eq.  (47)  or  Eq.  (48)  apply  only 
to  a  given  single  point  selected  in  advance  on  the  probability  density  function. 
The  correlation  from  one  point  estimate  to  the  next  is  not  known  and  one 
cannot  draw  confidence  bands  about  the  entire  curve  simultaneously.  One  can 
only  do  this  for  one  given  individual  point. 
10. THE WEIBULL DISTRIBUTION FOR FATIGUE TESTS


When constructing a statistical model for life length or fatigue failure
rate of a structure one often finds that an assumption of normality is not
satisfied. For example, many life length distributions are markedly skewed.
The instantaneous failure rate or so-called hazard rate (see Section 10.1 for
the definition) of the normal distribution is a strictly increasing function
of time that becomes asymptotically linear, which is not desirable in many
structural fatigue models.

In  order  to  describe  the  random  behavior  of  fatigue  life,  a  number  of 
probability  distributions  have  been  proposed.  Among  these,  the  exponential 
distribution is best known and most widely used in electronic, chemical, and
other application areas.

The exponential distribution has a number of desirable statistical
properties, but its usefulness is limited because of the following property:
If the life length T of a structure has an exponential distribution, then
previous use does not affect its future life length. This fact is easily seen
in the following relation.

Let T be the random variable distributed with an exponential probability
density,

$$f(t) = \begin{cases} \dfrac{1}{\alpha}\,e^{-t/\alpha} & \text{if } t \ge 0 \\ 0 & \text{if } t < 0 \end{cases}$$

Then

$$P(T > a + b \mid T > b) = \frac{P(T > a + b,\ T > b)}{P(T > b)}$$

Now, if T > a + b, then it is simultaneously larger than b, hence

$$\frac{P(T > a + b,\ T > b)}{P(T > b)} = \frac{P(T > a + b)}{P(T > b)} = \frac{e^{-(a+b)/\alpha}}{e^{-b/\alpha}} = e^{-a/\alpha} = P(T > a)$$
In words, given that a structure has lasted for b or more units of time,
the probability of lasting "a" or more additional units of time is the same
as the probability of a new unit lasting "a" or more units of time. In
short, a characteristic of the exponential distribution is a constant failure
rate when it is used as a model for time to failure. That is, the average
number of failures which occur in a unit time period remains constant with
time. Thus, if a structure has a constant failure rate the exponential
distribution is a tailor-made model.
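
The lack-of-memory property can be checked numerically. This sketch assumes the exponential survival function P(T > t) = exp(-t/α) for t ≥ 0; the function names and the numerical values are illustrative.

```python
import math

def survival(t: float, alpha: float) -> float:
    """P(T > t) for an exponential life length with mean alpha."""
    return math.exp(-t / alpha) if t >= 0.0 else 1.0

def conditional_survival(a: float, b: float, alpha: float) -> float:
    """P(T > a + b | T > b): survive a further a units given survival to b."""
    return survival(a + b, alpha) / survival(b, alpha)

alpha = 3.0
for b in (0.0, 1.0, 5.0, 20.0):
    # The conditional probability does not depend on the age b already attained.
    print(b, conditional_survival(2.0, b, alpha), survival(2.0, alpha))
```

Each row prints the same pair of values up to rounding, which is the statement that previous use does not affect future life length.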

In  general,  the  distribution  of  life  length  for  some  object  has  a  unique 
shape,  scale,  and  location.  For  example,  any  particular  structure  has  a 
unique  failure  rate.  The  Weibull  distribution  has  three  parameters  which 
determine  shape,  scale,  and  location.  Thus,  the  life  length  of  many 
structures  can  be  suitably  modeled  by  determining  each  of  three  parameters. 
Among  others,  one  of  the  advantages  of  using  the  Weibull  distribution  is  to 
be  able  to  model  so  many  different  fatigue  life  distributions;  it  can  be 
exponential,  Rayleigh  or  approximately  normal.  The  following  sections 
present  some  of  the  statistical  properties  and  applications  of  the  Weibull 
distribution. 
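10.1 DEFINITION OF THE WEIBULL DISTRIBUTION

The general form of the Weibull distribution is:

$$W(t) = \begin{cases} 1 - \exp\left\{-\left[\dfrac{t-\gamma}{\alpha}\right]^{\beta}\right\} & \text{if } t \ge \gamma \\ 0 & \text{if } t < \gamma \end{cases} \tag{50}$$

and the density function is

$$w(t) = \begin{cases} \dfrac{\beta}{\alpha}\left(\dfrac{t-\gamma}{\alpha}\right)^{\beta-1} \exp\left\{-\left[\dfrac{t-\gamma}{\alpha}\right]^{\beta}\right\} & \text{if } t \ge \gamma \\ 0 & \text{if } t < \gamma \end{cases} \tag{51}$$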


In the above equations, α, β, and γ are parameters of the distribution
generally named as follows:

    α = scale parameter
    β = shape parameter
    γ = location parameter

(α and β are not to be confused with level of significance and probability of
Type II Error which they often denote.) The parameter α is analogous to
the variance of the normal distribution in that its value affects the scaling
of the distribution. The parameter γ is a location parameter as is the
mean of a normal distribution in that it translates the distribution. This
parameter γ may be interpreted in life length testing as the minimum
length of time that passes before any failure can occur. The parameter β
is termed the shape parameter since it affects the basic shape of the
distribution.
Figure 7. Four Shapes of the Weibull Density Function

Four different shapes depending on β are illustrated in Figure 7.

The shape parameter β describes the mode of failure. Thus, for β = 1
the failure rate is constant over time. β < 1 indicates that the failure rate
is a decreasing function of time, while for β > 1 the rate is increasing with
time. A more descriptive way of looking at a statistical fatigue model is
by considering the (instantaneous) failure rate or so-called hazard rate.
It is the instantaneous rate of change in failure probability. That is, the
hazard rate H(t) is defined by

$$H(t) = \lim_{\Delta t \to 0} \frac{P(T > t) - P(T > t + \Delta t)}{\Delta t\,P(T > t)} = \frac{f(t)}{P(T > t)}$$

where f(t) is the value of the density function at T = t. Thus, the hazard rate is
the density function of time to failure given that the system has not failed prior
to time t.
In the case of the Weibull distribution, the hazard rate is

$$H(t) = \frac{\beta \, (t - \gamma)^{\beta - 1}}{\alpha^{\beta}} \tag{52}$$

Equation (52) shows that the Weibull distribution reduces to the exponential
distribution when $\beta = 1$. From Eq. (52), it is noted that the exponential
distribution is associated with a constant hazard rate $1/\alpha$. This fact was
shown using basic probability notations in the introduction to Section 10.

The hazard rates of the Weibull distribution for several values of $\beta$
and with the other parameters fixed are illustrated in Figure 8.

Figure 8. H(t) with $\gamma = 0$, $\alpha = 1$, $\beta = 1, 2, 3, 4$
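The behavior plotted in Figure 8 can be checked numerically. The following is a minimal sketch, assuming the hazard rate has the form $\beta (t - \gamma)^{\beta - 1} / \alpha^{\beta}$ of Eq. (52); the function name `weibull_hazard` is introduced here for illustration only.

```python
def weibull_hazard(t, alpha, beta, gamma=0.0):
    """Hazard rate of Eq. (52): beta * (t - gamma)**(beta - 1) / alpha**beta."""
    if t <= gamma:
        raise ValueError("the hazard rate is evaluated here only for t > gamma")
    return beta * (t - gamma) ** (beta - 1) / alpha ** beta


if __name__ == "__main__":
    # Parameters of Figure 8: gamma = 0, alpha = 1, beta = 1, 2, 3, 4.
    for beta in (1, 2, 3, 4):
        values = [round(weibull_hazard(t, 1.0, beta), 3) for t in (0.5, 1.0, 1.5)]
        print("beta =", beta, "H(t) at t = 0.5, 1.0, 1.5:", values)
```

For $\beta = 1$ the printed values are constant, and for $\beta > 1$ they increase with t, in agreement with the interpretation of the shape parameter.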
The shape parameter $\beta$ may be interpreted in terms of the hazard rate
as follows:

    $\beta > 1$    an increasing hazard rate
    $\beta = 1$    a constant hazard rate
    $\beta < 1$    a decreasing hazard rate

Let $b = 1/\beta$. Then one finds the following values of the first two
moments and the median for the Weibull distribution (Reference 10):

$$\text{mean} = \mu = \gamma + \alpha \, (b!) \tag{53}$$

$$\text{variance} = \sigma^2 = \alpha^2 \left[ (2b)! - (b!)^2 \right] \tag{54}$$

$$\text{median} = m = \gamma + \alpha \, (\log 2)^{b} \tag{55}$$

where $b!$ denotes $\Gamma(1 + b)$.

When $\beta = 3.57$, the Weibull distribution becomes a good approximation
of the normal distribution. This is because

$$b! = \Gamma\!\left(\frac{4.57}{3.57}\right) \approx (.693)^{1/3.57} = (\log 2)^{b} \tag{56}$$

and the mean and median are approximately equal when $\beta = 3.57$.

Note that the distribution has positive skewness if $\beta < 3.57$. The
Weibull distribution becomes the Rayleigh distribution (with 2 d.f.)
when $\beta = 2$.
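Equations (53) through (55) are easily evaluated by machine. The sketch below assumes that $b!$ means $\Gamma(1 + b)$ and uses the standard library gamma function; the name `weibull_summary` is hypothetical.

```python
import math


def weibull_summary(alpha, beta, gamma=0.0):
    """Return (mean, variance, median) from Eqs. (53), (54), and (55)."""
    b = 1.0 / beta
    b_fact = math.gamma(1.0 + b)            # b!
    two_b_fact = math.gamma(1.0 + 2.0 * b)  # (2b)!
    mean = gamma + alpha * b_fact
    variance = alpha ** 2 * (two_b_fact - b_fact ** 2)
    median = gamma + alpha * math.log(2.0) ** b
    return mean, variance, median


if __name__ == "__main__":
    # Near beta = 3.57 the mean and median nearly coincide.
    for beta in (1.0, 2.0, 3.57):
        mean, variance, median = weibull_summary(1.0, beta)
        print(beta, round(mean, 4), round(variance, 4), round(median, 4))
```

With $\alpha = 1$ and $\gamma = 0$, the mean and median at $\beta = 3.57$ differ only in the fourth decimal place, while for smaller $\beta$ the mean exceeds the median, as expected for positive skewness.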
10.2 WEIBULL PARAMETER ESTIMATES

When the Weibull distribution is assumed as a statistical model,
the problem of estimating three parameters arises. Two methods, the maximum
likelihood and minimum chi-square methods, are generally applied to estimation
problems. It is not intended to give an exhaustive discussion here on
statistical inference concerning the Weibull distribution. Only a few
applicable results are presented.

(a) Maximum likelihood estimate (M.L.E.) of $\alpha$ when $\gamma$ and $\beta$ are known.
The likelihood function to be maximized is

$$L = \prod_{i=1}^{n} w(T_i) \tag{57}$$

where $T_1, T_2, \ldots, T_n$ are the sample data (times to failure). Logarithms
are now taken, which simplifies the solution of Eq. (57):

$$\log L = \sum_{i=1}^{n} \log \left[ w(T_i) \right] \tag{58}$$

When Eq. (58) is maximized with respect to $\alpha$ assuming $\gamma$ and $\beta$ are
known constants, one finds

$$\hat{\alpha}^{\beta} = \frac{1}{n} \sum_{i=1}^{n} (T_i - \gamma)^{\beta} \tag{59}$$

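(b) M.L.E.'s of $\beta$ and $\gamma$. When Eq. (58) is maximized simultaneously with
respect to $\beta$ and $\gamma$, one obtains the two equations

$$\frac{n}{\beta} + \sum_{i=1}^{n} \log \frac{T_i - \gamma}{\alpha} - \sum_{i=1}^{n} \left( \frac{T_i - \gamma}{\alpha} \right)^{\beta} \log \frac{T_i - \gamma}{\alpha} = 0 \tag{60}$$

$$\frac{\beta}{\alpha} \sum_{i=1}^{n} \left( \frac{T_i - \gamma}{\alpha} \right)^{\beta - 1} - (\beta - 1) \sum_{i=1}^{n} \frac{1}{T_i - \gamma} = 0 \tag{61}$$

These two equations along with Eq. (59) have to be solved to yield M.L.E.'s
for $\alpha$, $\beta$, and $\gamma$. This cannot be done explicitly (Reference 10), but an
iterative solution by computer is possible.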
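As an illustration of such an iterative solution, the following sketch treats the simpler case in which $\gamma$ is known. It assumes the Weibull density with scale $\alpha$ and shape $\beta$ used throughout this section; substituting Eq. (59) into the derivative of the log likelihood with respect to $\beta$ leaves one equation in $\beta$, which is solved here by bisection. The function name, the bracketing interval, and the simulated data are illustrative choices, not part of the report.

```python
import math
import random


def weibull_mle_known_gamma(times, gamma, lo=0.01, hi=100.0, tol=1e-10):
    """Return (alpha_hat, beta_hat) for known gamma by bisection on beta."""
    x = [t - gamma for t in times]
    if min(x) <= 0.0:
        raise ValueError("every time to failure must exceed gamma")
    n = len(x)
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / n
    max_log = max(logs)

    def weights(beta):
        # x_i**beta scaled by its largest value to avoid overflow
        return [math.exp(beta * (v - max_log)) for v in logs]

    def score(beta):
        # 1/beta + mean(log x) - sum(x**beta * log x) / sum(x**beta)
        w = weights(beta)
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return 1.0 / beta + mean_log - weighted

    if not (score(lo) > 0.0 > score(hi)):
        raise ValueError("root not bracketed; widen the interval [lo, hi]")
    while hi - lo > tol:  # score(beta) decreases as beta increases
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    # Eq. (59): alpha_hat**beta = (1/n) * sum(x_i**beta)
    alpha_hat = math.exp(max_log) * (sum(weights(beta_hat)) / n) ** (1.0 / beta_hat)
    return alpha_hat, beta_hat


if __name__ == "__main__":
    random.seed(1)
    # Simulated (not measured) data: gamma = 10, alpha = 5, beta = 2.
    sample = [10.0 + random.weibullvariate(5.0, 2.0) for _ in range(2000)]
    print(weibull_mle_known_gamma(sample, 10.0))
```

When $\gamma$ is also unknown, Eq. (61) must be solved together with this equation, and the search becomes two-dimensional.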

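(c) A very useful result is derived by Menon (Reference 11). Let
$x_i = T_i - \gamma$. Let the $y_i$ be formed of those negative values obtained from
$\log x_i - \log \alpha$, while the $z_i$ are formed of those positive values
obtained from $\log x_i - \log \alpha$. That is, define $y_i$ and $z_i$ as follows:

$$y_i = \begin{cases} \log x_i - \log \alpha & \text{if } \log x_i - \log \alpha < 0 \\ 0 & \text{otherwise} \end{cases}$$

$$z_i = \begin{cases} \log x_i - \log \alpha & \text{if } \log x_i - \log \alpha > 0 \\ 0 & \text{otherwise} \end{cases}$$

Then, when $\alpha$ and $\gamma$ are known,

$$\hat{b} = -\frac{.74}{n} \sum_{i=1}^{n} y_i + \frac{1.85}{n} \sum_{i=1}^{n} z_i \tag{62}$$

$$\mathrm{Var}(\hat{b}) = \frac{.66 \, b^2}{n} \tag{63}$$

where $b = 1/\beta$. Note that the summations on $y_i$ and $z_i$ run to n since
n values for each of these variables are defined even though some of the
values are zero. This is convenient for this theoretical work, but for
computational purposes one summation would run from 1 to p and the other
from 1 to q, where n = p + q.

When $\alpha$ is unknown but $\gamma$ is known,

$$\hat{b} = \left\{ \frac{6}{\pi^2 (n - 1)} \left[ \sum_{i=1}^{n} (\log x_i)^2 - \frac{1}{n} \left( \sum_{i=1}^{n} \log x_i \right)^2 \right] \right\}^{1/2} \tag{64}$$

$$\mathrm{Var}(\hat{b}) = \frac{1.1 \, b^2}{n} \tag{65}$$

where $b = 1/\beta$.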


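Example: Suppose n = 15, $\frac{1}{n} \sum y_i = -.5$, and $\frac{1}{n} \sum z_i = 5.4$.
Then, using Eq. (62), one obtains

$$\hat{\beta} = \frac{1}{\hat{b}} = \frac{1}{(-.74)(-.5) + (1.85)(5.4)} = .0965$$

and using Eq. (63), the variance of $\hat{b}$ is computed to be

$$\mathrm{Var}(\hat{b}) = \frac{.66}{15} \left( \frac{1}{.0965} \right)^2 = 4.72$$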
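The computation for the case of known $\alpha$ and $\gamma$ can be sketched as follows. The form of the estimator, $\hat{b} = -(.74/n) \sum y_i + (1.85/n) \sum z_i$ with variance $.66\, b^2 / n$, is an assumption inferred from the arithmetic of the example above, with the estimate substituted for the unknown b in the variance. The function name `menon_known_alpha` is hypothetical.

```python
import math


def menon_known_alpha(x, alpha):
    """Return (b_hat, var_b_hat) from x_i = T_i - gamma when alpha is known."""
    n = len(x)
    d = [math.log(v) - math.log(alpha) for v in x]
    y_sum = sum(v for v in d if v < 0.0)  # the negative differences
    z_sum = sum(v for v in d if v > 0.0)  # the positive differences
    b_hat = -0.74 * y_sum / n + 1.85 * z_sum / n
    var_b_hat = 0.66 * b_hat ** 2 / n     # b replaced by its estimate
    return b_hat, var_b_hat


if __name__ == "__main__":
    # Arithmetic of the example: averages -.5 and 5.4 with n = 15.
    b_hat = -0.74 * (-0.5) + 1.85 * 5.4
    print(round(b_hat, 2), round(1.0 / b_hat, 4), round(0.66 * b_hat ** 2 / 15, 2))
```

The estimate of $\beta$ is then the reciprocal of the returned `b_hat`.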
(d) Graphical methods may be used for estimation. Through the use of
Weibull probability graph paper, a simple method is available for obtaining
estimates of the parameters $\alpha$ and $\beta$. This technique is
discussed in Reference 12.


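10.3 WEIBULL PARAMETER CONFIDENCE LIMITS

(a) Confidence limits for $\alpha$ when $\beta$ and $\gamma$ are known are now given.
Let $y_i = (T_i - \gamma)^{\beta}$, $i = 1, 2, \ldots, n$, and $y = (T - \gamma)^{\beta}$. Then

$$P(y \le y_0) = P\left( T - \gamma \le y_0^{1/\beta} \right) = 1 - e^{-y_0 / \alpha^{\beta}}$$

It is clear that y has an exponential distribution with a single parameter
$\alpha^{\beta}$. Reference 13 shows that $2n\hat{\alpha}^{\beta} / \alpha^{\beta}$ is distributed as chi-square with
2n degrees-of-freedom ($\chi^2_{2n}$), where $\hat{\alpha}^{\beta}$ is the estimate of Eq. (59).
Thus, the desired $(1 - \epsilon)$ confidence interval is defined by the equation

$$P\left[ \chi^2_{2n}(\epsilon/2) < \frac{2n\hat{\alpha}^{\beta}}{\alpha^{\beta}} < \chi^2_{2n}(1 - \epsilon/2) \right] = P\left[ \frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(1 - \epsilon/2)} < \alpha^{\beta} < \frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(\epsilon/2)} \right] = 1 - \epsilon \tag{66}$$

where $\chi^2_{2n}(\epsilon/2)$ and $\chi^2_{2n}(1 - \epsilon/2)$ are lower and upper tail $\epsilon/2$ percentiles
of the $\chi^2_{2n}$ distribution.

Example: Suppose $\hat{\alpha}^{\beta} = 30$ (hours), n = 9, and $\epsilon = .1$. Then

$$P\left[ \frac{540}{28.87} < \alpha^{\beta} < \frac{540}{9.39} \right] = P(18.70 < \alpha^{\beta} < 57.51) = .9$$

Thus, the 90% confidence limits for $\alpha^{\beta}$ are (18.70, 57.51).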
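The limits of Eq. (66) reduce to two divisions once the chi-square percentiles have been read from a table. The sketch below takes the tabulated percentiles as arguments rather than computing them; the function name is hypothetical.

```python
def alpha_beta_confidence_limits(alpha_beta_hat, n, chi2_lower, chi2_upper):
    """Confidence limits for alpha**beta from Eq. (66).

    chi2_lower and chi2_upper are the lower and upper tail percentiles of
    the chi-square distribution with 2n degrees of freedom, taken from a table.
    """
    scaled = 2.0 * n * alpha_beta_hat
    return scaled / chi2_upper, scaled / chi2_lower


if __name__ == "__main__":
    # Values of the example: estimate 30, n = 9, percentiles 9.39 and 28.87.
    lower, upper = alpha_beta_confidence_limits(30.0, 9, 9.39, 28.87)
    print(round(lower, 2), round(upper, 2))
```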
(b) Rather conservative confidence limits for $\beta$ can be constructed using
Chebyshev's inequality and Eqs. (62) and (63) or (64) and (65). Chebyshev's
inequality states that

$$P\left[ |\hat{b} - b| < \epsilon^{-1} \sqrt{\mathrm{Var}\, \hat{b}} \right] \ge 1 - \epsilon^2$$

or

$$P\left[ \hat{b} - \epsilon^{-1} \sqrt{\mathrm{Var}\, \hat{b}} < b < \hat{b} + \epsilon^{-1} \sqrt{\mathrm{Var}\, \hat{b}} \right] \ge 1 - \epsilon^2$$

Thus, if $\alpha$ is known

$$P\left[ \hat{\beta}(1 - \delta) < \beta < \hat{\beta}(1 + \delta) \right] \ge 1 - \frac{.66}{n \delta^2} \tag{67}$$

and if $\alpha$ is unknown

$$P\left[ \hat{\beta}(1 - \delta) < \beta < \hat{\beta}(1 + \delta) \right] \ge 1 - \frac{1.1}{n \delta^2}$$

where $\delta > 0$ is a constant.

Example: Suppose one wishes to construct 90% confidence limits on $\beta$
given the data of the example in Section 10.2(c). Let
$1 - \frac{.66}{n \delta^2} = .9$; then $\delta = .663$. When $\gamma$ and $\alpha$ are known, one
obtains from Eqs. (62), (63), and (67)

$$P[.0965(1 - .663) < \beta < .0965(1 + .663)] = P(.033 < \beta < .160) \ge .9$$

Thus, the 90% confidence limits for $\beta$ are (.033, .160).
10.4 HYPOTHESIS TESTS

(a) Suppose one wishes to test the hypothesis $H_0$: $\theta = \theta_0$ ($\theta$ may be $\alpha$, $\beta$, or $\gamma$).
The likelihood ratio test is used here when the sample size n is large. Let

$$\lambda = \frac{\prod_{i=1}^{n} w(t_i \mid \theta_0)}{\prod_{i=1}^{n} w(t_i \mid \hat{\theta})} \tag{68}$$

where $w(t_i \mid \theta_0)$ is the density function given in Eq. (51) when $\theta = \theta_0$. Similarly,
$w(t_i \mid \hat{\theta})$ denotes the value of the density function when $\theta = \hat{\theta}$, where $\hat{\theta}$ is the
M.L.E. of $\theta$. When n is fairly large, $-2 \log \lambda$ is approximated by the $\chi^2_1$
distribution. Hence, a value of $\lambda$ is obtained from the sample data and $-2 \log \lambda$
may be compared with an appropriate value of $\chi^2_1$ obtained from a table.

Example: Suppose one wishes to test the hypothesis $\beta = 1.5$ with
sample size n = 30. Assume that the computation by
Eq. (68) yields the value $\lambda = .11$. Then $-2 \log \lambda = 4.41$.
Since the 5% critical value of $\chi^2_1$ is 3.84, the hypothesis is
rejected at the 95% confidence level.
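The statistic $-2 \log \lambda$ is more safely computed from sums of log densities than from the products in Eq. (68), which underflow for large n. The sketch below assumes the three-parameter Weibull density with scale $\alpha$, shape $\beta$, and location $\gamma$ used throughout this section; the function names and the ordering of the parameter tuple are illustrative.

```python
import math


def weibull_log_density(t, alpha, beta, gamma):
    """Logarithm of the Weibull density at t > gamma."""
    z = (t - gamma) / alpha
    return math.log(beta / alpha) + (beta - 1.0) * math.log(z) - z ** beta


def neg2_log_lambda(times, theta0, theta_hat):
    """-2 log(lambda) of Eq. (68); each theta is a tuple (alpha, beta, gamma)."""
    log_l0 = sum(weibull_log_density(t, *theta0) for t in times)
    log_l1 = sum(weibull_log_density(t, *theta_hat) for t in times)
    return -2.0 * (log_l0 - log_l1)


if __name__ == "__main__":
    # The example quotes lambda = .11 directly; compare with the 5% value 3.84.
    statistic = -2.0 * math.log(0.11)
    print(round(statistic, 2), statistic > 3.84)
```

In practice `theta_hat` would hold the maximum likelihood estimates and `theta0` the same values with the tested parameter replaced by its hypothesized value.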


(b) The $\chi^2$ distribution may be used for a Weibull goodness of fit test. A
criterion for testing the goodness of fit of a Weibull distribution from n
samples is

$$X^2 = \sum_{i=1}^{k} \frac{(f_i - n p_i)^2}{n p_i} \tag{69}$$

where $p_i$ is the probability that a failure occurs in the interval $t_{i-1}$ to $t_i$, and
$f_i$ is the actual number of failures in the interval $t_{i-1}$ to $t_i$. The quantity k
is the number of cells, chosen such that approximately $n p_i \ge 5$ for each i. If this
condition is satisfied, then $X^2$ is approximated by a $\chi^2$ distribution.
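A sketch of the computation of Eq. (69) follows. The cell probabilities $p_i$ are taken from the Weibull distribution function $1 - e^{-[(t - \gamma)/\alpha]^{\beta}}$; the cell boundaries and the four data values in the demonstration are hypothetical and far too few for a real test, serving only to show the arithmetic.

```python
import math


def weibull_cdf(t, alpha, beta, gamma):
    """Probability of failure at or before time t."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / alpha) ** beta))


def chi_square_statistic(times, edges, alpha, beta, gamma):
    """X**2 of Eq. (69) for cells bounded by edges t_0 < t_1 < ... < t_k."""
    n = len(times)
    statistic = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = weibull_cdf(hi, alpha, beta, gamma) - weibull_cdf(lo, alpha, beta, gamma)
        f = sum(1 for t in times if lo < t <= hi)  # observed failures in the cell
        statistic += (f - n * p) ** 2 / (n * p)
    return statistic


if __name__ == "__main__":
    # Two equally probable cells for alpha = 1, beta = 1, gamma = 0.
    edges = [0.0, math.log(2.0), math.inf]
    print(chi_square_statistic([0.1, 0.2, 0.3, 1.0], edges, 1.0, 1.0, 0.0))
```

The resulting value is compared with a tabulated $\chi^2$ percentile, provided the expected count $n p_i$ in every cell is about 5 or more.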
10.5 RELIABILITY PROBLEMS

Suppose one has several systems, each of which could be used for the
same purpose. Very often one has to choose the one which is best suited
for the purpose. Define reliability R(t) as the probability that a system
will perform satisfactorily for at least a given period of time t. In the
case of the Weibull distribution

$$R(t) = P(T > t) = e^{-[(t - \gamma)/\alpha]^{\beta}}$$

A commonly used decision procedure is to choose a system with the maximum
reliability $R(t_0)$ if the system has to last to time $t_0$ in the case of a
non-replacement policy. This policy applies when the system is not replaceable
or one does not wish to replace the system if it fails. Another situation is
the replacement policy; this is a policy of immediate replacement when the
system fails. A simple decision procedure in this case is to choose a system
with the maximum mean life.

Example: Suppose a structure is characterized by $\gamma = 10$ hours,
$\alpha = 80$ hours, and $\beta = 2$. Then

$$R(50) = e^{-.25} = .779$$

$$\text{mean life} = 10 + 80 \left( \tfrac{1}{2} \right)! = 80.9 \text{ hours}$$

That is, the probability that the structure will last at least
50 hours is .779. The graph of R(t) for the Weibull distributions
with some fixed parameters is given in Figure 9.
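The two quantities in this example can be reproduced with a few lines of code. This is a minimal sketch assuming $R(t) = e^{-[(t - \gamma)/\alpha]^{\beta}}$ and a mean life of $\gamma + \alpha\, \Gamma(1 + 1/\beta)$; the function names are hypothetical.

```python
import math


def reliability(t, alpha, beta, gamma):
    """Probability that the system survives beyond time t."""
    if t <= gamma:
        return 1.0  # no failure can occur before gamma
    return math.exp(-(((t - gamma) / alpha) ** beta))


def mean_life(alpha, beta, gamma):
    """Mean time to failure, gamma + alpha * Gamma(1 + 1/beta)."""
    return gamma + alpha * math.gamma(1.0 + 1.0 / beta)


if __name__ == "__main__":
    # Structure of the example: gamma = 10 hours, alpha = 80 hours, beta = 2.
    print(round(reliability(50.0, 80.0, 2.0, 10.0), 3))
    print(round(mean_life(80.0, 2.0, 10.0), 1))
```

Under a non-replacement policy the first quantity is compared among candidate systems; under a replacement policy the second is compared.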


The above mentioned decision procedure for a non-replaceable system
does not account for the average reliability or cost, which in general are
functions of time. Thus, it seems that a more sophisticated technique is
necessary depending on the situation. The following two procedures are
proposed examples.

Figure 9. $R(t) = e^{-t^{\beta}}$ ($\gamma = 0$ and $\alpha = 1$)


(a) To guard against the worst case, select the system j associated with $\theta_j$
which will maximize

$$\min_{t} \; \phi\{c(t)\} \, R(t \mid \theta_j) \tag{70}$$

where $\phi\{c(t)\}$ is a function of the cost at time t and $R(t \mid \theta_j)$ is the reliability of
a system with the parameter $\theta_j$. This decision procedure is an extremely
conservative one. One simply selects the system which is the best in the
worst case. (See Figure 10.)


(b) Suppose one is interested in over-all performance during the time of
operation $t_0$. In this case a better decision is to select the system j
associated with $\theta_j$ which will maximize

$$\int_0^{t_0} \phi\{c(t)\} \, R(t \mid \theta_j) \, dt \tag{71}$$
That is, a system which is the best on average during the time of operation
should be selected. In Figure 10, one clearly prefers system 3 over systems
1 and 2 by the above method (a), which only considers the point $t_0$. In this case,
if one is interested in over-all operation time, system 1 is most preferable.

[Figure 10. Vertical axis: $\phi\{c(t)\} \, R(t \mid \theta_j)$]

Example. Assume $\phi\{c(t)\} = 1$ for j = 1 or 2. Suppose that $t_0 = 2$, $\gamma = 0$, $\alpha = 1$. If
there is a choice of two systems with $\beta = 1$ and $\beta = 2$, then since $e^{-2} > e^{-4}$
the decision by Eq. (70) will choose the system with $\beta = 1$ to guard against
the worst case, while the relation

$$\int_0^2 e^{-t} \, dt < \int_0^2 e^{-t^2} \, dt$$

indicates that the system with $\beta = 2$ is better if one is interested in average
reliability during the time period of operation.
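The two criteria of this example can be compared numerically. The sketch below assumes $\phi\{c(t)\} = 1$, $\gamma = 0$, and $\alpha = 1$ as in the example, takes the worst case as the minimum of R(t) over a grid on the interval from 0 to $t_0$, and evaluates the integrated reliability by Simpson's rule. The function names and the grid size are illustrative.

```python
import math


def reliability(t, beta):
    """R(t) = exp(-t**beta), i.e. gamma = 0 and alpha = 1."""
    return math.exp(-(t ** beta))


def worst_case(beta, t0, steps=1000):
    """Minimum of R(t) over a grid on [0, t0] (criterion of Eq. (70))."""
    return min(reliability(t0 * i / steps, beta) for i in range(steps + 1))


def integrated(beta, t0, steps=1000):
    """Integral of R(t) over [0, t0] by Simpson's rule; steps must be even."""
    h = t0 / steps
    total = reliability(0.0, beta) + reliability(t0, beta)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * reliability(i * h, beta)
    return total * h / 3.0


if __name__ == "__main__":
    for beta in (1.0, 2.0):
        print(beta, round(worst_case(beta, 2.0), 4), round(integrated(beta, 2.0), 4))
```

The system with $\beta = 1$ has the larger worst-case value, while the system with $\beta = 2$ has the larger integral, which reproduces the two conflicting decisions described in the example.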
10.6 SAMPLE SIZE

Methods of determining sample sizes needed for estimating various
quantities are proposed here.

(a) Suppose a confidence interval about $\alpha^{\beta}$ is desired whose length is $\le L$.
The parameters $\beta$ and $\gamma$ are assumed to be known (see Section
10.6 (c) below). The sample size n is determined such that n
satisfies the following two conditions:

$$\frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(\epsilon/2)} - \frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(1 - \epsilon/2)} \le L$$

and

$$P\left[ \frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(1 - \epsilon/2)} < \alpha^{\beta} < \frac{2n\hat{\alpha}^{\beta}}{\chi^2_{2n}(\epsilon/2)} \right] = 1 - \epsilon$$

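(b) Now consider a confidence interval about $\beta$ whose desired length is
$\le L$. Assume that $\alpha$ and $\gamma$ are known. Equation (67) states:

$$P\left[ \hat{\beta}(1 - \delta) < \beta < \hat{\beta}(1 + \delta) \right] \ge 1 - \frac{.66}{n \delta^2}$$

where $\delta > 0$ is a constant. If the confidence coefficient is $p_0$, then

$$1 - \frac{.66}{n \delta^2} = p_0 \quad \text{or} \quad \delta = \sqrt{\frac{.66}{(1 - p_0) \, n}}$$

Thus,

$$2 \delta \hat{\beta} \le L$$

and

$$\frac{2.62 \, \hat{\beta}^2}{(1 - p_0) \, L^2} \le n \tag{72}$$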
Since the inequality (72) has two unknowns n and $\hat{\beta}$, n has to be determined
iteratively. That is, n is increased until the inequality (72) is satisfied.
This procedure is illustrated in the following example.

Example: Suppose that one wishes to estimate $\beta$ by Eqs. (62) and (63)
within an accuracy of $\pm .5$ with confidence of 90%. One has to
determine n from inequality (72) iteratively. Suppose one
guesses n = 16 and obtains an estimate $\hat{\beta} = 2$. Then
inequality (72) is not satisfied since

$$\frac{(2.62)(4)}{(.1)(1)} = 104.8 > 16$$

Now suppose a sample of size n = 100 is used, and an estimate
$\hat{\beta} = 1.95$ is obtained. The inequality (72) is satisfied since

$$\frac{(2.62)(1.95)^2}{(.1)(1)} = 99.6 < 100$$

thus 100 is a sufficiently large sample size for the purpose.

(c) A confidence interval about $\beta$ whose expected length is $\le L$ when $\gamma$
is known but $\alpha$ is unknown will now be derived. A similar approach
to (b) yields

$$1 - \frac{1.1}{n \delta^2} = p_0 \quad \text{or} \quad \delta = \sqrt{\frac{1.1}{(1 - p_0) \, n}}$$

where $p_0$ is the confidence coefficient. The desired sample size n is
given by

$$\frac{4.39 \, \hat{\beta}^2}{(1 - p_0) \, L^2} \le n$$

Again the above inequality has to be solved for n iteratively.
(d) As a further example for sample size computations, it will be
assumed that $\beta = 2$. This would seem to correspond to a reasonable
idea of the shape of a density function for time to fatigue failure.
Choosing a specific reasonable value for the parameter $\beta$ and further
assuming that $\gamma$ is known simplifies many of the parameter estimation
problems. It has been noted that for $\beta = 2$, the Weibull distribution
becomes a $\chi$ distribution with two degrees-of-freedom if the scale
parameter is chosen as $\alpha = \sqrt{2}$ and the location parameter $\gamma = 0$.
This is also the Rayleigh distribution. The equation for the density
function in this case is

$$W(x) = x \, e^{-x^2/2} \tag{73}$$

It can be shown (Reference 10) that the mean value of a random
variable x having the distribution W(x) is (when $\beta = 2$)

$$E(x) = \gamma + \alpha \, \Gamma\!\left( \tfrac{3}{2} \right) = \gamma + \alpha \frac{\sqrt{\pi}}{2} \tag{74}$$

and the variance is

$$\mathrm{Var}(x) = \alpha^2 \left[ \Gamma(2) - \Gamma^2\!\left( \tfrac{3}{2} \right) \right] = \alpha^2 \left[ 1 - \frac{\pi}{4} \right] \tag{75}$$

where $\Gamma(n)$ is the Gamma function.

To obtain an estimate of the sample size needed to estimate the mean
of the distribution (the mean time to fatigue failure) one examines estimates
of E(x) of Eq. (74). If $\gamma$ is assumed to be known, then the maximum
likelihood estimate of $\alpha^2$ is

$$\hat{\alpha}^2 = \frac{1}{N} \sum_{i=1}^{N} x_i^2 \tag{76}$$
where $x_i = T_i - \gamma$ and $T_i$ is the time to failure of the ith test item. The
total number of test items is denoted by N. From Section 10.3 the quantity
$2N\hat{\alpha}^2 / \alpha^2$ has a $\chi^2$ distribution with 2N d.f. Hence, $\hat{\alpha}^2$ has the distribution

$$\hat{\alpha}^2 \sim \frac{\alpha^2}{2N} \, \chi^2(2N) \tag{77}$$

Therefore, the expected value and variance of $\hat{\alpha}^2$ are

$$E(\hat{\alpha}^2) = \alpha^2 \tag{78}$$

and

$$\mathrm{Var}(\hat{\alpha}^2) = \frac{\alpha^4}{4N^2} \, \mathrm{Var}\left[ \chi^2(2N) \right] = \frac{\alpha^4}{4N^2} \cdot 4N = \frac{\alpha^4}{N} \tag{79}$$

The normalized standard error $\epsilon$ is then determined by

$$\epsilon^2 = \frac{\mathrm{Var}(\hat{\alpha}^2)}{E^2(\hat{\alpha}^2)} = \frac{1}{N} \tag{80}$$

or

$$\epsilon = \frac{1}{\sqrt{N}} \tag{81}$$

These formulas could be employed to estimate $\alpha^2$ directly. However,
since the unsquared quantity $\alpha$ appears in Eq. (74), this is the quantity of
interest.

For large N, if a variable x has a $\chi^2(2N)$ distribution, then $\sqrt{2x}$
is approximately normal with mean $\sqrt{2N - 1}$ and variance of unity.
See page 371, Reference 14. The unit variance in this transformation is
an approximation but only terms of the form 1/4N are neglected. Hence,
for even moderate values of N the approximation is good. Using these
facts one calculates for the mean value:

$$E(\hat{\alpha}) = \frac{\alpha}{2\sqrt{N}} \, E\left[ \sqrt{2\chi^2(2N)} \right] \approx \frac{\alpha}{2\sqrt{N}} \sqrt{2N - 1} \tag{82}$$

and the variance:

$$\mathrm{Var}(\hat{\alpha}) = \frac{\alpha^2}{4N} \, \mathrm{Var}\left[ \sqrt{2\chi^2(2N)} \right] \approx \frac{\alpha^2}{4N} \tag{83}$$

The normalized standard error $\epsilon$ is then obtained from

$$\epsilon^2 = \frac{\dfrac{\pi}{4} \cdot \dfrac{\alpha^2}{4N}}{\dfrac{\pi}{4} \cdot \dfrac{\alpha^2}{4N} (2N - 1)} = \frac{1}{2N - 1} \tag{84}$$

Finally, the normalized standard error of the unknown portion of the
mean time to fatigue failure is given by

$$\epsilon = \frac{1}{\sqrt{2N - 1}} \tag{85}$$

One may now use Eq. (85) for experiment planning purposes. Large sample
sizes should be expected to properly employ Eq. (85), but for the lack of
better available techniques, one may use it as a reasonable guideline for
relatively small samples also.

For example, assume it is desired to estimate the unknown portion
of mean time to fatigue failure (with $\gamma$ assumed known) where a
normalized standard error of 20% is allowed. Then

$$\sqrt{2N - 1} = \frac{1}{.2} = 5, \qquad N = 13$$

As one can see from this example, obtaining fairly good estimates of
fatigue life will require extensive panel and other structural testing. To
work the formula the other way, assume N = 20 panels are tested to failure.
Then the normalized standard error is

$$\epsilon = \frac{1}{\sqrt{2N - 1}} = .16$$

In these estimates, the minimum time to failure, $\gamma$, has been assumed
known. Inclusion of this figure will reduce percentage errors. This reduction
will be significant if $\gamma$ is large compared to $\alpha$.

The main reason for the assumptions of known $\beta$ and known $\gamma$ is
that considerably more complication enters the estimation procedures
when all three parameters are to be estimated from the data. The procedures
discussed above provide suitable guidelines for fatigue experiment
planning. After a fair amount of data is collected, revised estimates can
be obtained for the parameters and more accurate procedures can be used
for subsequent experimental program design.
10.7 EXAMPLE OF FATIGUE TEST

The following data are obtained from an actual sonic fatigue test
result of a certain panel.

    Time to Failure
    in 100 sec ($T_i$)    $\log (T_i - \gamma)$    $[\log (T_i - \gamma)]^2$

          35                  2.71                  7.34
          37                  2.83                  8.01
          38                  2.89                  8.35
          33                  2.56                  6.55
          39                  2.94                  8.64
          37                  2.83                  8.01
          30                  2.30                  5.29
          29                  2.20                  4.84
          30                  2.30                  5.29
          27                  1.95                  3.80
          39                  2.94                  8.64
          49                  3.37                 11.36
          39                  2.94                  8.64
          23                  1.10                  1.21
          24                  1.39                  1.93
          34                  2.64                  6.97
          26                  1.79                  3.20
          28                  2.08                  4.33
          29                  2.20                  4.84
          30                  2.30                  5.29

    Total 656                48.26                122.53

Table 2. Summary of Test Data


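Assume that it is guaranteed or known from long experience that the
panel never fails before 2000 sec. Thus, let $\gamma = 20$ and let $x_i = T_i - \gamma$
for each i. Then

$$\left( \sum \log x_i \right)^2 = 2329.03, \qquad \sum (\log x_i)^2 = 122.53$$

Using Eq. (64),

$$\hat{b} = \left\{ \frac{6}{\pi^2 (19)} \left[ 122.53 - 116.45 \right] \right\}^{1/2} = .44, \qquad \hat{\beta} = \frac{1}{.44} = 2.27$$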
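The estimate can be recomputed directly from the failure times of Table 2. The sketch below assumes Menon's estimator for unknown $\alpha$ in the form $\hat{b}^2 = \frac{6}{\pi^2 (n-1)} \left[ \sum (\log x_i)^2 - (\sum \log x_i)^2 / n \right]$ with variance $1.1\, b^2 / n$, which is the form consistent with the arithmetic shown above. The function name is hypothetical.

```python
import math

# Times to failure from Table 2, in units of 100 sec.
TIMES = [35, 37, 38, 33, 39, 37, 30, 29, 30, 27,
         39, 49, 39, 23, 24, 34, 26, 28, 29, 30]


def menon_shape_estimate(times, gamma):
    """Return (beta_hat, var_b_hat) when gamma is known and alpha is unknown."""
    logs = [math.log(t - gamma) for t in times]
    n = len(logs)
    s1 = sum(logs)
    s2 = sum(v * v for v in logs)
    b_hat = math.sqrt(6.0 / (math.pi ** 2 * (n - 1)) * (s2 - s1 * s1 / n))
    var_b_hat = 1.1 * b_hat ** 2 / n  # b replaced by its estimate
    return 1.0 / b_hat, var_b_hat


if __name__ == "__main__":
    beta_hat, var_b_hat = menon_shape_estimate(TIMES, 20.0)
    print(round(beta_hat, 2), round(var_b_hat, 4))
```

Because the code uses unrounded logarithms, its result can differ slightly in the second decimal place from the 2.27 obtained with the two-decimal entries of Table 2.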
The estimated $\beta$ value of 2.27 indicates that although the fatigue failure
distribution is almost symmetrical (see Figure 11), the distribution has
slight positive skew. A plot of the test data also confirms the positive skew,
and a normal distribution may be a poor model in this example as in many
fatigue test analyses. Also, it indicates that the failure rate is a sharply
increasing function of time (see Figure 8). Now assuming that $\hat{\beta}$ is the
true value, $\alpha$ can be estimated by Eq. (59). A simple approximation can be
obtained by Eq. (53) as follows:

$$\hat{\alpha} = \frac{\hat{\mu} - \gamma}{b!}$$

Figure 11. Distribution of Failure Time

In the above example

$$\hat{\mu} = \bar{T} = \frac{\sum T_i}{n} = 32.8 = 20 + (.394)! \, \hat{\alpha}$$

or

$$\hat{\alpha} = \frac{12.8}{(.394)!} = 14.41$$
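Thus, the desired density function as a mathematical model is estimated by

$$\hat{w}(t) = \begin{cases} \dfrac{2.27 \, (t - 20)^{1.27}}{(14.41)^{2.27}} \exp\left\{ -\left[ \dfrac{t - 20}{14.41} \right]^{2.27} \right\} & \text{for } t \ge 20 \\[2ex] 0 & \text{for } t < 20 \end{cases}$$

The reliability function is estimated by

$$\hat{R}(t) = \begin{cases} \exp\left\{ -\left[ \dfrac{t - 20}{14.41} \right]^{2.27} \right\} & \text{for } t \ge 20 \\[2ex] 1 & \text{for } t < 20 \end{cases}$$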

In the following, reliability functions based on the Weibull and normal
distributions are denoted $\hat{R}(t)$ and $R_N(t)$ respectively, and the actual percent of
sample panels which have not failed at time t is denoted $R_s(t)$. By the
definition of reliability,

$$R_N(t) = \int_t^{\infty} \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right] dx \tag{86}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of the normal
distribution.

In the above example, $\bar{T} = 32.8$ and s = 6.38. Thus, $R_N(t)$ is
estimated by

$$\hat{R}_N(t) = \frac{1}{\sqrt{2\pi}} \int_{(t - 32.8)/6.38}^{\infty} e^{-x^2/2} \, dx$$

The actual percent of samples that did not fail at time t is

$$R_s(t) = 1 - \frac{\text{number of failures before } t}{\text{sample size}}$$

The following table compares three estimates.

      t     $\hat{R}(t)$ (Weibull)    $R_N(t)$ (normal)    $R_s(t)$ (sample)

     20          1.00                  .98                 1.00
     25           .91                  .89                  .90
     27           .82                  .82                  .85
     30           .65                  .67                  .65
     33           .45                  .49                  .50
     35           .34                  .36                  .40
     38           .19                  .21                  .25
     40           .12                  .13                  .05
     45           .03                  .03                  .05

Table 3. Estimates of Reliabilities
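The three columns of Table 3 can be regenerated as follows. This sketch assumes the estimated Weibull parameters $\gamma = 20$, $\hat{\alpha} = 14.41$, $\hat{\beta} = 2.27$, the normal parameters $\bar{T} = 32.8$ and s = 6.38, and the failure times of Table 2; the normal tail probability is obtained from the complementary error function. The function names are hypothetical.

```python
import math

# Times to failure from Table 2, in units of 100 sec.
TIMES = [35, 37, 38, 33, 39, 37, 30, 29, 30, 27,
         39, 49, 39, 23, 24, 34, 26, 28, 29, 30]


def weibull_reliability(t, alpha=14.41, beta=2.27, gamma=20.0):
    """Estimated Weibull reliability; equal to 1 before the minimum life gamma."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / alpha) ** beta))


def normal_reliability(t, mean=32.8, sd=6.38):
    """Upper tail probability of the fitted normal distribution."""
    return 0.5 * math.erfc((t - mean) / (sd * math.sqrt(2.0)))


def sample_reliability(t, times=TIMES):
    """Fraction of sample panels that had not failed before time t."""
    return 1.0 - sum(1 for v in times if v < t) / len(times)


if __name__ == "__main__":
    for t in (20, 25, 27, 30, 33, 35, 38, 40, 45):
        print(t, round(weibull_reliability(t), 2),
              round(normal_reliability(t), 2), round(sample_reliability(t), 2))
```

The largest differences between the fitted curves and the sample fractions occur near t = 38 and t = 40, where a single failure changes the sample fraction by .05.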


For example, the probability that the panel survives 3000 seconds is .65.
In the actual sample of 20 tests, 65% of the panels survived 3000 seconds.

In order to describe the failure probability of the panel, the survival
curve, which is a graph of R(t), is given in Figure 12.

[Figure 12. Survival curve: R(t) plotted against time]
11. RECOMMENDATIONS

The properties of the Weibull distribution have been discussed.
Indications are that the Weibull distribution is suitable as a mathematical
model of life length when the three parameters are properly estimated.

The best procedure may be to estimate the parameters from the simultaneous
equations given in Section 10.2. However, the analytical approach
does not yield explicit solutions. Thus, development of a computer program
is recommended to provide iterative solutions.

More research is recommended to improve the method of estimation
of the Weibull parameters. The properties of estimators such as consistency,
efficiency, and asymptotic distribution should be investigated.

Some decision procedures for distinguishing between two sets of life
data in terms of reliability have been presented. This is another area which
needs further study for more useful applications. The applications of the
minimax, Bayes, and other decision procedures may be explored.

Additional studies in some areas of sample size reduction would be
fruitful. Clearly, it is laborious to reduce a very large sample by the
technique described in Section 3. One has to compute a sequence of means
and ranges in the process of eliminating outliers. Then one needs a random
number table. The technique is recursive in nature and as a result it is
well suited to a computer solution; the method can be easily and compactly
programmed. It is recommended that basic methods of sampling be studied
further. Such criteria as relative efficiency, net relative efficiency,
costs, bias, and practicality should be evaluated for useful applications.
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